ibm-watson-cognitive AlchemyLanguage


AlchemyLanguage is a collection of text analysis methods that provide deeper insight into your text or HTML content. See the Getting Started topic for an introduction to AlchemyLanguage and other Watson services. For more details and examples, see the API reference and documentation.

Size limits

  • HTML content before text cleaning: 600 KB
  • Source text, after text cleaning: 50 KB
  • Calls that use Custom Models: 5 KB
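
The limits above can be checked on the client before a call is made. The following is a minimal sketch, not part of the service itself: the function name within_limit and the kind labels are hypothetical, and it assumes that 1 KB means 1024 bytes and that size is measured on the UTF-8 encoded content.

```python
# Limits copied from the list above.
# Assumption: 1 KB = 1024 bytes; the service may count differently.
KB = 1024

SIZE_LIMITS = {
    "html": 600 * KB,          # HTML content before text cleaning
    "text": 50 * KB,           # source text after text cleaning
    "custom_model": 5 * KB,    # calls that use Custom Models
}


def within_limit(content: str, kind: str) -> bool:
    """Return True if the UTF-8 encoded content fits the limit for `kind`.

    `kind` is a hypothetical label: "html", "text", or "custom_model".
    """
    return len(content.encode("utf-8")) <= SIZE_LIMITS[kind]
```

Measuring encoded bytes rather than characters is the conservative choice, because non-ASCII characters occupy more than one byte in UTF-8.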

Language support

To see which languages are supported for each function, refer to each function's entry in the API reference.

Language detection

By default, AlchemyLanguage automatically detects the language of your source text. You can manually specify the language of your content with the language query parameter (for example, language=spanish).
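
As an illustration of how the language parameter fits into a request, here is a minimal sketch that builds a query string with Python's standard library. The helper name with_language and the text parameter in the usage line are assumptions made for the example; only the language parameter and the value spanish come from this page.

```python
from typing import Optional
from urllib.parse import urlencode


def with_language(params: dict, language: Optional[str] = None) -> str:
    """Return a URL-encoded query string for `params`.

    When `language` is None the parameter is left out, so the service
    falls back to automatic language detection.
    """
    query = dict(params)  # copy so the caller's dict is not modified
    if language is not None:
        query["language"] = language
    return urlencode(query)


# "text" is a placeholder parameter name used only for this example.
print(with_language({"text": "Hola mundo"}, "spanish"))
```

Leaving the parameter out entirely, rather than sending an empty value, keeps the default detection behavior described above.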

Text cleaning

When you use an HTML or URL function of the API, AlchemyLanguage cleans the content to prepare the source text for analysis. The sourceText parameter lets you customize the cleaning process with the following options:

  • cleaned_or_raw (default) -- Removes website elements such as links and ads. If cleaning fails, the raw web page text is used.
  • cleaned -- Removes website elements such as links and ads.
  • raw -- Uses raw web page text with no cleaning.
  • cquery -- Uses the visual constraints query that you specify in the cquery parameter. See the documentation for details about visual constraints queries.
  • xpath -- Uses the XPath query that you specify in the xpath parameter.
  • xpath_or_raw -- Uses the results of an XPath query, falling back to plain text if the XPath query returns nothing.
  • cleaned_and_xpath -- Uses the results of an XPath query on cleaned web page text.
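
The options above can be validated on the client before a request is sent. The sketch below is illustrative only: the function name source_text_params is hypothetical, and it assumes that xpath_or_raw and cleaned_and_xpath read their query from the same xpath parameter as the xpath option, which this page does not state explicitly.

```python
# Option names copied from the list above.
SOURCE_TEXT_OPTIONS = {
    "cleaned_or_raw", "cleaned", "raw", "cquery",
    "xpath", "xpath_or_raw", "cleaned_and_xpath",
}

# Companion parameter each option needs.
# Assumption: every XPath-based option uses the `xpath` parameter.
REQUIRED_PARAM = {
    "cquery": "cquery",
    "xpath": "xpath",
    "xpath_or_raw": "xpath",
    "cleaned_and_xpath": "xpath",
}


def source_text_params(mode="cleaned_or_raw", cquery=None, xpath=None):
    """Return the query parameters for a text cleaning mode.

    Raises ValueError for an unknown mode or a missing companion query.
    """
    if mode not in SOURCE_TEXT_OPTIONS:
        raise ValueError(f"unknown sourceText option: {mode!r}")
    params = {"sourceText": mode}
    supplied = {"cquery": cquery, "xpath": xpath}
    needed = REQUIRED_PARAM.get(mode)
    if needed is not None:
        if not supplied[needed]:
            raise ValueError(f"sourceText={mode} requires the {needed} parameter")
        params[needed] = supplied[needed]
    return params
```

For example, source_text_params("xpath", xpath="//p") yields a dictionary containing both sourceText and xpath, while calling it with no arguments yields only the default cleaned_or_raw mode.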