We’ve recently published a new version of our Text Classification API, which comes hand in hand with a new version of the Classification Models Customization console.
In both these new versions, the main focus is on user models. We know how important it is to easily define the exact criteria you need, so the new classification API supports a new type of resource, the one generated by the Classification Model Customization Console 2.0.
In this post, we will talk about how to migrate to these new versions if you are currently using the old ones. Text Classification 1.1 and Classification Models 1.0 will be retired on 15/Sep/2020.
There are three possible migration scenarios:
- You are using any of the predefined models except for IAB.
  - You only have to migrate the API, as described below.
- You are using the IAB model.
  - You will need to migrate your API connection to Deep Categorization to use IAB 2.0.
  - You will need to adapt IAB to IAB 2.0.
- You are using a user-defined model.
  - You will need to migrate the API, as described below.
  - You will need to adapt your classification model to a model in the new version.
Migrating the API: from Text Classification 1.1 to Text Classification 2.0
Migrating the integration with an API has two parts: adapting the request and the response the API returns. In both cases, the changes are minimal, so the migration should not be very costly.
The following table contains the most relevant changes in the request:
| | Text Classification 1.1 | Text Classification 2.0 |
|---|---|---|
| Endpoint | https://api.meaningcloud.com/class-1.1 | https://api.meaningcloud.com/class-2.0 |
| Parameter `debug` | Did not exist. | When enabled, it shows additional debug information about the rules in the model that have been triggered. It only applies to user-defined models. |
| Parameter `abstract` | Did not exist. | Descriptive abstract of the content. The terms relevant for the classification process found in the abstract will have more influence in the classification than if they were in the text (but less than the ones in the title). |
| Parameter `categories` | | Renamed to `categories_filter`, same behavior. |
| Parameter `expand_hierarchy` | Did not exist. | It lets you choose whether the results include the parents or the ancestors of the category or categories in which the content has been classified. It only applies to models with explicit hierarchy. By default it shows no ancestors, which is the same behavior as in version 1.1. |
All the other parameters from Text Classification 1.1 that are not explicitly mentioned behave exactly the same in Text Classification 2.0.
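As a rough illustration of the request changes above, the following Python sketch converts the endpoint and parameters of a 1.1 request into their 2.0 equivalents. The function name `migrate_request` and the sample parameter names and values (`key`, `txt`, `model`) are placeholders chosen for this example, not taken from the API documentation; only the two endpoints and the `categories` to `categories_filter` rename come from the table.

```python
OLD_ENDPOINT = "https://api.meaningcloud.com/class-1.1"
NEW_ENDPOINT = "https://api.meaningcloud.com/class-2.0"


def migrate_request(url, params):
    """Return (url, params) adapted from Text Classification 1.1 to 2.0."""
    if url != OLD_ENDPOINT:
        raise ValueError(f"not a Text Classification 1.1 endpoint: {url}")
    new_params = dict(params)  # copy so the caller's dict is left untouched
    # `categories` was renamed to `categories_filter`; behavior is unchanged.
    if "categories" in new_params:
        new_params["categories_filter"] = new_params.pop("categories")
    return NEW_ENDPOINT, new_params


# Hypothetical 1.1 request: key, text, model and filter values are placeholders.
old_params = {
    "key": "YOUR_API_KEY",
    "txt": "Some text to classify",
    "model": "my-model",
    "categories": "01",
}
new_url, new_params = migrate_request(OLD_ENDPOINT, old_params)
```

The new optional parameters, such as `debug` or `abstract`, can then be added to the returned dictionary before sending the request.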
The response does not change much either. The main two changes come from the new parameters enabled:
- When `debug` is enabled, a new element, `debug`, will appear in the output with a list of the rules triggered in the text classification, each with two values: the rule and the weight it adds.
- Each term included in `term_list` now also contains a field called `abs_frequency` with the frequency of the term in the classified text.
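To make the second change more concrete, here is a small Python sketch that collects the new frequency field for every term in a response. It assumes the response has already been parsed from JSON into a dictionary, that it follows the `category_list` / `term_list` / `form` layout shown in the IAB example later in this post, and that the frequency arrives as a numeric string like the other numeric fields. The function name and the sample data are made up for illustration.

```python
def term_frequencies(response):
    """Map each category code to a {term form: abs_frequency} dictionary."""
    result = {}
    for category in response.get("category_list", []):
        terms = {}
        for term in category.get("term_list", []):
            # abs_frequency is assumed to be a numeric string, e.g. "3".
            if "abs_frequency" in term:
                terms[term["form"]] = int(term["abs_frequency"])
        result[category["code"]] = terms
    return result


# Made-up response fragment, for illustration only.
sample = {
    "category_list": [
        {
            "code": "cat-01",
            "term_list": [
                {"form": "restaurant", "abs_relevance": "2", "abs_frequency": "3"}
            ],
        }
    ]
}
frequencies = term_frequencies(sample)
```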
Easy peasy! You can read all the documentation for the new API here.
Migrating the API when using IAB: from Text Classification 1.1 to Deep Categorization 1.0
The IAB model is no longer going to be included as a predefined model in the Text Classification API. Instead, an improved version of this model, IAB 2.0, is provided with the Deep Categorization API. This migration is only needed if you are currently using the IAB model and wish to keep doing so.
Again, migrating the integration with an API has two parts: adapting the request and the response the API returns. Let’s see the changes.
The following table contains the most relevant changes in the request:
| | Text Classification 1.1 | Deep Categorization 1.0 |
|---|---|---|
| Endpoint | https://api.meaningcloud.com/class-1.1 | https://api.meaningcloud.com/deepcategorization-1.0 |
| Parameter `of` | Values supported: json/xml | Values supported: json |
| Parameter `title` | Text sent as title to the classification. | Does not exist. Should be included with the rest of the content to analyze. |
| Parameter `categories` | | Does not exist. |
All the other parameters from Text Classification 1.1 that are not explicitly mentioned behave exactly the same in Deep Categorization 1.0.
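The request changes for the IAB case can be sketched in Python along the same lines. The parameter names `txt` and `model`, the function name, and every sample value below are assumptions made for this example. In particular, the name of the IAB 2.0 model is passed in by the caller and should be taken from the Deep Categorization documentation.

```python
DEEPCAT_ENDPOINT = "https://api.meaningcloud.com/deepcategorization-1.0"


def migrate_iab_request(params, iab2_model):
    """Adapt Text Classification 1.1 (IAB) parameters to Deep Categorization 1.0.

    `iab2_model` is the IAB 2.0 model name, supplied by the caller.
    """
    new_params = dict(params)  # work on a copy
    # `title` no longer exists: send it together with the rest of the content.
    title = new_params.pop("title", "")
    if title:
        new_params["txt"] = f"{title}\n{new_params.get('txt', '')}"
    # `categories` does not exist in Deep Categorization.
    new_params.pop("categories", None)
    # Only JSON output is supported.
    new_params["of"] = "json"
    new_params["model"] = iab2_model
    return DEEPCAT_ENDPOINT, new_params


# Hypothetical 1.1 request; all values are placeholders.
old_params = {
    "key": "YOUR_API_KEY",
    "of": "xml",
    "title": "Dinner",
    "txt": "We went out.",
    "model": "OLD_IAB_MODEL",
    "categories": "X",
}
url, params = migrate_iab_request(old_params, "IAB_2_MODEL_NAME")
```

Joining the title and the text with a newline is just one option; any separator that keeps them as distinct sentences should work.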
The response does not change much either. The only change is in the `term_list` field:
| Text Classification 1.1 (IAB) | Deep Categorization 1.0 (IAB 2.0) |
|---|---|
| `category_list: [ { code: "Food&Drink>DiningOut", label: "Food & Drink>Dining Out", abs_relevance: "2", relevance: "100", term_list: [ { form: "restaurant", abs_relevance: "2" } ] } ]` | `category_list: [ { code: "Food&Drink>DiningOut", label: "Food and Drink>Dining Out", abs_relevance: "2", relevance: "100", term_list: [ { form: "restaurant", abs_relevance: "2", offset_list: [ { inip: "19", endp: "28" } ] } ] } ]` |
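One possible use of the new `offset_list` field is to locate each term in the original text. The Python sketch below assumes that `inip` and `endp` are zero-based character positions and that `endp` is inclusive, which matches the example above (positions 19 to 28 cover the ten letters of "restaurant"). Both the function name and the sample text are made up for this illustration.

```python
def term_occurrences(text, term):
    """Return the substrings of `text` that a term's offset_list points at."""
    found = []
    for offset in term.get("offset_list", []):
        start = int(offset["inip"])
        end = int(offset["endp"])
        # endp is treated as inclusive, hence the + 1 in the slice.
        found.append(text[start:end + 1])
    return found


# Made-up text in which "restaurant" starts at character 19.
text = "We had dinner at a restaurant"
term = {
    "form": "restaurant",
    "abs_relevance": "2",
    "offset_list": [{"inip": "19", "endp": "28"}],
}
occurrences = term_occurrences(text, term)
```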
You can read all the documentation for the Deep Categorization API here, and read about the differences in the new IAB version here.
Heads up!
Pay close attention to how credits are counted for Deep Categorization, as your consumption may increase depending on the length of the texts you are classifying!
Migrating your model to the Classification Model Customization Console 2.0
So what happens if you already have a working model defined in the old customization console? You may be wondering if you have to redefine the whole thing… Don’t worry, we’ve got you!
If you access any of your models, you will see a new button in the Actions section of the sidebar called “Migrate”.
When clicked, this button will show a dialog that will let you launch a process to migrate the model automatically to the new version. The process will create a new model called “[your-model-name]-Migrated” in the customization console 2.0.
This new model will contain the same information as the old one, but any rules defined for it will be translated into the new rule syntax (which is one of the most significant changes). When the process is done, you will be redirected to a report of the migration with all relevant details.
The following image shows the migration report we obtain for the example model we provide:
The migration process does the following:
- Creates a new model in the 2.0 console.
- Updates it with the settings in your current model that apply to the new version (including stopwords). The rest of them are set to the default values, except `lemmatization`, which is disabled to match the behavior of the original model more closely.
- Creates the categories that exist in your current model in the migrated model.
- Updates the new categories with the transformed information from the old ones:
  - Training text is stored in the new category as it is.
  - Rules are translated into the new syntax.
The migration report provides detailed information on how this transformation is made, so you can check whether everything is correct.
The only way to access the report again after leaving it is to redo the process, so we recommend downloading it as a PDF using the button in the bottom right corner.
Once your model has been migrated, you should check that it behaves as expected. As you will still have access to the previous version, you can classify the same collection of texts with both and adjust any differences that appear. Some of the things you will need to check are:
- Relevance values: the relevance is computed differently in this new version, so you will need to adjust the relevance thresholds defined in the settings. This especially applies to the `minimum absolute relevance`, which is now limited to values between 0 and 1. Anything higher than that will give you a warning in the migration process.
- Terms definition:
  - Now that lemmatization is supported, it may help simplify your current rules.
  - Some of the new operators can help you define more accurate rules.
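If you classify the same texts with both versions, a small helper can point out where the two models disagree. The Python sketch below compares the category codes found in two already-parsed responses. It assumes that both responses expose a `category_list` whose items have a `code` field, as in the response example shown earlier; the function names and the sample codes are made up for illustration.

```python
def category_codes(response):
    """Return the set of category codes in a parsed classification response."""
    return {category["code"] for category in response.get("category_list", [])}


def compare_classifications(old_response, new_response):
    """Summarize which categories each model version assigned to one text."""
    old_codes = category_codes(old_response)
    new_codes = category_codes(new_response)
    return {
        "only_old": sorted(old_codes - new_codes),
        "only_new": sorted(new_codes - old_codes),
        "both": sorted(old_codes & new_codes),
    }


# Made-up responses for the same text from the original and migrated models.
old_response = {"category_list": [{"code": "A"}, {"code": "B"}]}
new_response = {"category_list": [{"code": "B"}, {"code": "C"}]}
differences = compare_classifications(old_response, new_response)
```

Texts that end up under `only_old` or `only_new` are good candidates for reviewing the relevance thresholds or the translated rules.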
Regarding the limit of models provided for your plan, it will apply to each one of the consoles separately, enabling you to maintain both versions while you test them. In other words, if you have two models in your plan, you will be able to have two models in each console while both consoles exist.
If you have any questions, issues or just want to say hi, we are always available at support@meaningcloud.com!