Channel: Developer Express Inc.

Office File API — Enhance Accessibility and Language Support in Word Documents using Azure AI Services

In my two previous AI-related blog posts, I explained how you can generate accessible descriptions for images, charts and hyperlinks used within Word and Excel documents using DevExpress Office File API libraries and the Azure AI OpenAI service:

  1. Office File API — Enhance Accessibility in Office Documents using OpenAI Models (Part 1)
  2. Office File API — Enhance Accessibility in Office Documents using OpenAI Models (Part 2)

In this post, I'll detail another important tip for accessible document generation: detecting and setting the proof language for paragraphs in a multi-language Word document. In addition, I'll describe how to translate multi-language documents and generate accessible comments that contain paragraph translations in a chosen language.

As you may know, there are several reasons why it's essential to specify proof language and add translations for text paragraphs when generating accessible Word documents (especially for multi-language documents). Language settings aid screen readers in interpreting and vocalizing text. Accessibility features and tools such as spell/grammar checkers rely on these settings to provide accurate suggestions and corrections. Language settings are also crucial when exporting Word documents to accessible PDF formats. Comments with paragraph translations add clarity to document content, contribute to document accessibility, and enhance collaboration in multi-language environments.

To address these particular requirements, we will use the DevExpress Word Processing Document library and two Azure AI services: Language and Translator. Please review the new endpoint for these capabilities in our sample project on GitHub: Office File API – Integrate AI.

Before you incorporate this solution in your app, please be sure to read and understand the terms and conditions of use for Azure AI Services.

Implementation Details

Use Azure AI Language APIs

The Azure AI Language service requires an Azure subscription. Once subscribed, create a Language resource in the Azure portal to obtain your key and endpoint.

Note: You can create a multi-service resource to access multiple AI services with the same key and endpoint.
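
Once you have the key and endpoint, you need a way to pass them to your application without hard-coding them. The following is a minimal sketch that reads them from environment variables. The AzureSettings class and the variable names mentioned below are hypothetical and are not part of the sample project.

```csharp
using System;

public static class AzureSettings
{
    // Returns the value of the given environment variable,
    // or throws if the variable is missing or blank.
    public static string GetRequired(string variableName)
    {
        string? value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{variableName}' is not set.");
        }
        return value;
    }
}
```

For example, you could call AzureSettings.GetRequired("AZURE_LANGUAGE_KEY") for the key and wrap AzureSettings.GetRequired("AZURE_LANGUAGE_ENDPOINT") in a Uri for the endpoint (both variable names are assumptions).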

Install the Azure.AI.TextAnalytics NuGet package in your project to use Language service APIs for text language detection. The following code snippet authenticates a TextAnalyticsClient and sends text to the service, which returns information about the detected language. For this implementation, we only need the language name in "ISO 639-1" format.

using System;
using System.Threading.Tasks;
using Azure;
using Azure.AI.TextAnalytics;

public class AzureAILanguageHelper
{
    private TextAnalyticsClient client;
    internal AzureAILanguageHelper(string key, Uri endpoint)
    {
        AzureKeyCredential azureCredential = new AzureKeyCredential(key);
        client = new TextAnalyticsClient(endpoint, azureCredential);
    }
    internal async Task<string> DetectTextLanguage(string text)
    {
        DetectedLanguage detectedLanguage = await client.DetectLanguageAsync(text);
        return detectedLanguage.Iso6391Name.Replace('_', '-');
    }
}
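
The detected code is later used to build a CultureInfo object, and the service may return a value that .NET does not recognize as a culture name. The following is a minimal sketch of a validation helper that makes this case explicit. The CultureLookup class and the TryCreateCulture method are hypothetical names and are not part of the sample project.

```csharp
using System;
using System.Globalization;

public static class CultureLookup
{
    // Tries to convert a detected language code (for example, "fr") to a CultureInfo.
    // Returns false for blank codes and for names .NET rejects as culture names.
    public static bool TryCreateCulture(string? languageCode, out CultureInfo? culture)
    {
        culture = null;
        if (string.IsNullOrWhiteSpace(languageCode))
        {
            // An empty name would otherwise resolve to the invariant culture.
            return false;
        }
        try
        {
            culture = new CultureInfo(languageCode);
            return true;
        }
        catch (CultureNotFoundException)
        {
            return false;
        }
    }
}
```

A caller can then skip the paragraph when TryCreateCulture returns false instead of relying on an empty catch block.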

Use Azure AI Translator APIs

Much like the Language service, Azure AI Translator requires an Azure subscription. You need to create a Translator resource in the Azure portal (or use a multi-service resource). 

Install the Azure.AI.Translation.Text NuGet package in your project and authenticate a TextTranslationClient.

Note: If you use an Azure AI multi-service resource or a regional Translator resource, you must specify the "region" parameter when authenticating the client.

The following code snippet sends text and the target language name (in "ISO 639-1" format) to the service and obtains translated content in the response.

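using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Azure;
using Azure.AI.Translation.Text;

public class AzureAITranslationHelper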
{
    TextTranslationClient client;
    internal AzureAITranslationHelper(string key, Uri endpoint, string region = "global")
    {
        AzureKeyCredential azureCredential = new AzureKeyCredential(key);
        client = new TextTranslationClient(azureCredential, endpoint, region);
    }
    internal async Task<string> TranslateText(string text, string sourceLanguage, string targetLanguage)
    {
        Response<IReadOnlyList<TranslatedTextItem>> response = await client.TranslateAsync(targetLanguage, text, sourceLanguage);
        TranslatedTextItem translatedTextItem = response.Value.First();
        return translatedTextItem.Translations[0].Text;
    }
}

Implement Word Processing Document API Endpoint

To use the APIs above in your Word Processing Document API application, load your document in a RichEditDocumentServer instance and process each paragraph in the Document.Paragraphs collection as follows:

  1. Obtain paragraph text using the Document.GetText method and access paragraph character properties using the Document.BeginUpdateCharacters method.
  2. Check both paragraph text and character settings. If paragraph text isn't empty and language settings (CharacterProperties.Language) are not specified, call the AzureAILanguageHelper.DetectTextLanguage method to detect the paragraph language.
  3. Create a CultureInfo object for the detected language and assign it (wrapped in a LangInfo value) to the CharacterProperties.Language property.
  4. If the detected language differs from the default document language (the current example assumes that the default language is English), use the AzureAITranslationHelper.TranslateText method to translate paragraph text to the desired language. Then add a comment to the current paragraph and insert the translated text in that comment.
  5. Finalize paragraph editing using the Document.EndUpdateCharacters method.

public async Task<IActionResult> GenerateLanguageSettingsForParagraphs
    (IFormFile documentWithHyperlinks,
    [FromQuery] RichEditFormat outputFormat) {
  try {
      var languageHelper =
            new AzureAILanguageHelper(languageAzureKey, languageEndPoint);
      var translationHelper =
            new AzureAITranslationHelper(translationAzureKey, translationEndPoint);
      using (var server = new RichEditDocumentServer())
      {
          await RichEditHelper.LoadFile(server, documentWithHyperlinks);
          server.IterateSubDocuments(async (document) =>
          {
              foreach (var paragraph in document.Paragraphs)
              {
                  CharacterProperties cp =
                        document.BeginUpdateCharacters(paragraph.Range);
                  string paragraphText =
                  document.GetText(paragraph.Range);
                  if (cp.Language.Value.Latin == null &&
                      !string.IsNullOrWhiteSpace(paragraphText))
                  {
                      CultureInfo? culture = null;
                      string language =
                        languageHelper.DetectTextLanguage(paragraphText).Result;
                      // An unrecognized language code leaves culture null, so the paragraph is skipped
                      try { culture = new CultureInfo(language); }
                      catch (CultureNotFoundException) { }
                      if (culture != null)
                      {
                          // Set the paragraph language
                          cp.Language =
                                new DevExpress.XtraRichEdit.Model.LangInfo(culture, null, null);
                          if (language != "en")
                          {
                              // Generate an accessible comment with the paragraph translation
                              Comment comment =
                                    document.Comments.Create(paragraph.Range, "Translator");
                              SubDocument commentDoc = comment.BeginUpdate();
                              string translatedText =
                                  translationHelper.TranslateText(paragraphText, language, "en").Result;
                              commentDoc.InsertText(commentDoc.Range.Start,
                                    $"Detected Language: {language}\r\nTranslation (en): {translatedText}");
                              comment.EndUpdate(commentDoc);
                          }
                      }
                  }
                  document.EndUpdateCharacters(cp);
              }
          });
          Stream result =
               RichEditHelper.SaveDocument(server, outputFormat);
          string contentType =
                RichEditHelper.GetContentType(outputFormat);
          string outputStringFormat =
                outputFormat.ToString().ToLower();
          return File(result, contentType, $"result.{outputStringFormat}");
      }
  }
  catch (Exception e)
  {
      return StatusCode(500, e.Message + Environment.NewLine + e.StackTrace);
  }
}
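
The endpoint above calls .Result on the helper tasks, which blocks the calling thread while each Azure request completes. One possible alternative, sketched here under assumptions, is to collect paragraph text first, await detection for all paragraphs, and then apply the results to the document in a second pass. The ParagraphLanguageDetection class, the DetectAllAsync method, and the delegate parameter are hypothetical; in the sample project, the delegate would wrap AzureAILanguageHelper.DetectTextLanguage.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParagraphLanguageDetection
{
    // Detects a language code for every non-empty text.
    // Results keep the input order; blank entries map to null.
    public static async Task<IReadOnlyList<string?>> DetectAllAsync(
        IReadOnlyList<string> paragraphTexts,
        Func<string, Task<string>> detectLanguage)
    {
        var results = new string?[paragraphTexts.Count];
        for (int i = 0; i < paragraphTexts.Count; i++)
        {
            if (string.IsNullOrWhiteSpace(paragraphTexts[i]))
            {
                continue; // nothing to detect for an empty paragraph
            }
            // Requests are awaited one at a time to keep the request rate low.
            results[i] = await detectLanguage(paragraphTexts[i]);
        }
        return results;
    }
}
```

With this approach, the document itself is only touched before and after the awaited calls, so no document update is left open while a request is in flight.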

Check Output

The output Word file will include language settings (available for review within the Language dialog) for each non-empty document paragraph and comments with corresponding translations for each non-English text paragraph.

Your Feedback Matters

As always, your feedback is very important. Please let us know whether additional AI-related samples/solutions are of interest to you and how you expect AI to change your development strategies in the next 12 months.

