In my two previous AI-related blog posts, I explained how to generate accessible descriptions for images, charts, and hyperlinks in Word and Excel documents using DevExpress Office File API libraries and the Azure OpenAI service:
- Office File API — Enhance Accessibility in Office Documents using OpenAI Models (Part 1)
- Office File API — Enhance Accessibility in Office Documents using OpenAI Models (Part 2)
In this post, I’ll cover another important aspect of accessible document generation: detecting and setting the proofing language for paragraphs in a multi-language Word document. I'll also describe how to translate multi-language documents and generate accessible comments that contain paragraph translations in a chosen language.
There are several reasons to specify the proofing language and add translations for text paragraphs when generating accessible Word documents, especially multi-language documents. Language settings help screen readers interpret and vocalize text. Accessibility features and tools such as spelling and grammar checkers rely on these settings to provide accurate suggestions and corrections. Language settings are also crucial when exporting Word documents to accessible PDF formats. Finally, comments with paragraph translations clarify document content, contribute to document accessibility, and improve collaboration in multi-language environments.
To address these requirements, we will use the DevExpress Word Processing Document API library and two Azure AI services: Language and Translator. Please review the new endpoint for these capabilities in our sample project on GitHub: Office File API – Integrate AI.
Implementation Details
Use Azure AI Language APIs
The Azure AI Language service requires an Azure subscription. Once subscribed, create a Language resource in the Azure portal to obtain your key and endpoint.
Install the Azure.AI.TextAnalytics NuGet package to your project to use Language service APIs for text language detection.
The following code snippet authenticates TextAnalyticsClient and sends text to the service, which returns information about the detected language. For this implementation, we only need the language name in ISO 639-1 format.
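using System;
using System.Threading.Tasks;
using Azure;
using Azure.AI.TextAnalytics;

public class AzureAILanguageHelper
{
    private readonly TextAnalyticsClient client;

    internal AzureAILanguageHelper(string key, Uri endpoint)
    {
        // Authenticate with the key and endpoint of the Language resource
        AzureKeyCredential azureCredential = new AzureKeyCredential(key);
        client = new TextAnalyticsClient(endpoint, azureCredential);
    }

    internal async Task<string> DetectTextLanguage(string text)
    {
        DetectedLanguage detectedLanguage = await client.DetectLanguageAsync(text);
        // .NET culture names use hyphens, so replace any underscore in the returned code
        return detectedLanguage.Iso6391Name.Replace('_', '-');
    }
}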
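The Replace('_', '-') call above matters because the detected code is later passed to CultureInfo, and .NET culture names use hyphens. The following is a minimal sketch of that conversion step on its own, using only System.Globalization. The helper names NormalizeLanguageCode and TryGetCulture are hypothetical and not part of the sample project, and the "zh_chs" input is only an assumed example of an underscore-style code; check the Language service documentation for the codes it actually returns.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: turns an underscore-style code ("zh_chs") into a hyphenated tag ("zh-chs").
string NormalizeLanguageCode(string code) =>
    (code ?? string.Empty).Trim().Replace('_', '-');

// Hypothetical helper: returns false instead of throwing when no culture can be created.
bool TryGetCulture(string code, out CultureInfo? culture)
{
    culture = null;
    string normalized = NormalizeLanguageCode(code);
    if (normalized.Length == 0)
        return false; // an empty name would otherwise map to the invariant culture
    try
    {
        culture = CultureInfo.GetCultureInfo(normalized);
        return true;
    }
    catch (CultureNotFoundException)
    {
        return false;
    }
}

Console.WriteLine(NormalizeLanguageCode("zh_chs")); // zh-chs
Console.WriteLine(TryGetCulture("fr", out var french) ? french!.Name : "not found");
```

Whether an unrecognized name is rejected can depend on the platform and its globalization settings, so treat a successful lookup as "well-formed" rather than as proof that the language is supported.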
Use Azure AI Translator APIs
Much like the Language service, Azure AI Translator requires an Azure subscription. You need to create a Translator resource in the Azure portal (or use a multi-service resource).
Install the Azure.AI.Translation.Text NuGet package to your project and authenticate TextTranslationClient.
The following code snippet sends text and the target language name (in ISO 639-1 format) to the service and obtains translated content in response.
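using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Azure;
using Azure.AI.Translation.Text;

public class AzureAITranslationHelper
{
    private readonly TextTranslationClient client;

    internal AzureAITranslationHelper(string key, Uri endpoint, string region = "global")
    {
        // Authenticate with the key, endpoint, and region of the Translator resource
        AzureKeyCredential azureCredential = new AzureKeyCredential(key);
        client = new TextTranslationClient(azureCredential, endpoint, region);
    }

    internal async Task<string> TranslateText(string text, string sourceLanguage, string targetLanguage)
    {
        Response<IReadOnlyList<TranslatedTextItem>> response = await client.TranslateAsync(targetLanguage, text, sourceLanguage);
        // One input string and one target language yield a single item with a single translation
        TranslatedTextItem translatedTextItem = response.Value.First();
        return translatedTextItem.Translations[0].Text;
    }
}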
Implement Word Processing Document API Endpoint
To use the APIs above within your DevExpress-powered Word Processing Document API application, load your document in a RichEditDocumentServer instance and iterate through the Document.Paragraphs collection. Obtain paragraph text using the Document.GetText method and access paragraph character properties using the Document.BeginUpdateCharacters method. Check both the paragraph text and its character settings: if the text isn't empty and language settings (CharacterProperties.Language) are not specified, call the AzureAILanguageHelper.DetectTextLanguage method to detect the paragraph language. Then assign the detected language as a CultureInfo object to the CharacterProperties.Language property and finalize paragraph editing using the Document.EndUpdateCharacters method.
If the detected language differs from the default document language (this example assumes that the default language is English), use the AzureAITranslationHelper.TranslateText method to translate paragraph text to the desired language. At this point, you can add a comment to the current paragraph and insert the translated text in that comment.
public async Task<IActionResult> GenerateLanguageSettingsForParagraphs(
    IFormFile documentWithHyperlinks,
    [FromQuery] RichEditFormat outputFormat)
{
    try
    {
        var languageHelper =
            new AzureAILanguageHelper(languageAzureKey, languageEndPoint);
        var translationHelper =
            new AzureAITranslationHelper(translationAzureKey, translationEndPoint);
        using (var server = new RichEditDocumentServer())
        {
            await RichEditHelper.LoadFile(server, documentWithHyperlinks);
            // The Azure calls below block on .Result, so every paragraph
            // is processed before the document is saved.
            server.IterateSubDocuments(async (document) =>
            {
                foreach (var paragraph in document.Paragraphs)
                {
                    CharacterProperties cp =
                        document.BeginUpdateCharacters(paragraph.Range);
                    string paragraphText =
                        document.GetText(paragraph.Range);
                    // Process only non-empty paragraphs without a language setting
                    if (cp.Language.Value.Latin == null
                        && !string.IsNullOrWhiteSpace(paragraphText))
                    {
                        CultureInfo? culture = null;
                        string language =
                            languageHelper.DetectTextLanguage(paragraphText).Result;
                        // Skip codes that .NET does not recognize as a culture name
                        try { culture = new CultureInfo(language); }
                        catch (CultureNotFoundException) { }
                        if (culture != null)
                        {
                            // Set the paragraph language
                            cp.Language =
                                new DevExpress.XtraRichEdit.Model.LangInfo(culture, null, null);
                            if (language != "en")
                            {
                                // Generate an accessible comment with the paragraph translation
                                Comment comment =
                                    document.Comments.Create(paragraph.Range, "Translator");
                                SubDocument commentDoc = comment.BeginUpdate();
                                string translatedText =
                                    translationHelper.TranslateText(paragraphText, language, "en").Result;
                                commentDoc.InsertText(commentDoc.Range.Start,
                                    $"Detected Language: {language}\r\nTranslation (en): {translatedText}");
                                comment.EndUpdate(commentDoc);
                            }
                        }
                    }
                    document.EndUpdateCharacters(cp);
                }
            });
            // Save the processed document in the requested format
            Stream result =
                RichEditHelper.SaveDocument(server, outputFormat);
            string contentType =
                RichEditHelper.GetContentType(outputFormat);
            string outputStringFormat =
                outputFormat.ToString().ToLower();
            return File(result, contentType, $"result.{outputStringFormat}");
        }
    }
    catch (Exception e)
    {
        return StatusCode(500, e.Message + Environment.NewLine + e.StackTrace);
    }
}
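The endpoint calls the Azure helpers through .Result inside a synchronous callback, which blocks a thread for each request. One possible alternative, sketched below under stated assumptions, is to collect the pending calls inside the callback and await them afterwards. VisitAll and FakeDetectAsync are hypothetical stand-ins for IterateSubDocuments and DetectTextLanguage so that the sketch runs without Azure or DevExpress; it illustrates the pattern only and is not the sample project's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for a synchronous visitor such as IterateSubDocuments.
void VisitAll(IEnumerable<string> items, Action<string> visitor)
{
    foreach (string item in items)
        visitor(item);
}

// Hypothetical stand-in for a remote call such as DetectTextLanguage.
async Task<string> FakeDetectAsync(string text)
{
    await Task.Delay(10); // simulates network latency
    return text.StartsWith("Bonjour", StringComparison.Ordinal) ? "fr" : "en";
}

async Task<Dictionary<string, string>> DetectAllAsync(IEnumerable<string> paragraphs)
{
    var pending = new List<(string Text, Task<string> Detection)>();
    // Start one request per non-empty paragraph without blocking inside the callback.
    VisitAll(paragraphs, text =>
    {
        if (!string.IsNullOrWhiteSpace(text))
            pending.Add((text, FakeDetectAsync(text)));
    });
    // Wait for all requests, then read each result.
    await Task.WhenAll(pending.ConvertAll(p => p.Detection));
    var result = new Dictionary<string, string>();
    foreach (var (text, detection) in pending)
        result[text] = await detection;
    return result;
}

var languages = await DetectAllAsync(new[] { "Hello world", "", "Bonjour le monde" });
foreach (var pair in languages)
    Console.WriteLine($"{pair.Key}: {pair.Value}");
```

With this approach, document edits (language assignment and comments) would be applied in a second pass once the results are available. Starting every request at once may also run into service rate limits, so a real implementation would probably need to cap concurrency.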
Check Output
The output Word file will include language settings (available for review within the Language dialog) for each non-empty document paragraph and comments with corresponding translations for each non-English text paragraph.
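Whether a paragraph counts as non-English depends on how the detected code is compared with the default language. The endpoint compares the code with "en" directly. If you expect regional tags such as "en-US" or mixed casing, a more tolerant comparison on the primary language subtag may be useful. The sketch below assumes that situation; PrimaryLanguage and NeedsTranslation are hypothetical helper names that are not part of the sample project.

```csharp
using System;

// Hypothetical helper: extracts the primary subtag ("en" from "en-US" or "en_us").
string PrimaryLanguage(string tag) =>
    (tag ?? string.Empty).Trim().Replace('_', '-').Split('-')[0].ToLowerInvariant();

// Hypothetical helper: true when the detected language differs from the document default.
bool NeedsTranslation(string detected, string defaultLanguage)
{
    string primary = PrimaryLanguage(detected);
    // An empty code means nothing was detected, so there is nothing to translate.
    return primary.Length > 0 && primary != PrimaryLanguage(defaultLanguage);
}

Console.WriteLine(NeedsTranslation("fr", "en"));    // True
Console.WriteLine(NeedsTranslation("en-US", "en")); // False
```

Comparing primary subtags treats all regional variants of a language as the same language, which suits the translation decision here but would be too coarse if regional spelling differences matter for proofing.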
Your Feedback Matters
As always, your feedback is very important. Please let us know whether additional AI-related samples/solutions are of interest to you and how you expect AI to change your development strategies in the next 12 months.