Rosette Model Training Suite
Rosette Model Training Suite is a set of tools for annotating data and training models for Rosette entity and event extraction. The annotation tool uses active learning to guide the annotation process, providing suggestions and choosing samples that will ensure the model converges as rapidly as possible towards the highest quality results. As data is annotated, the model is trained.
Once trained, the models are uploaded into your production instance of Analytics Server to perform custom entity and event extraction.
Features of Rosette Model Training Suite include:
Reduced training data requirements
Optimized annotator and project manager experiences
Modular templates supporting different types of projects
Integration with the Rosette linguistic framework
A robust data store capable of managing multiple simultaneous multi-user annotation efforts
Display and search features providing both high-level and deep-dive views of each project’s progress
Accuracy metrics
Automatic model training
Trained custom models for deployment in production installations of Analytics Server
Supported languages
The following languages are supported by Model Training Suite for model training and extraction.
Documentation
The complete Model Training Suite documentation set includes the following guides:
System Administrator Guide
A guide for installing and maintaining both the training and production environments of the Model Training Suite. Included are instructions for moving trained models from the training environment into the production environment, as well as the documentation for the API calls for entity and event extraction.
Developing Models
A guide for the system architects and model administrators to aid in defining the modeling strategy and understanding the theory of model training. It includes an explanation of event modeling and how to design an event schema in preparation for training event extraction models, as well as guidelines for gathering and preparing data for model training.
Adaptation Studio User Guide
A guide for the managers, adjudicators, and annotators using Adaptation Studio describing how to use the tool to create and maintain projects, annotate and train entity and event extraction models, and create event schemas.
Model training architecture
A complete Model Training Suite installation includes the following major components. All installations must include Adaptation Studio and Analytics Server. An installation may include one or both of the training servers: Entity Training Server and Event Training Server.
Adaptation Studio: Provides annotation and project management features, as well as user and role management and the project database.
Analytics Server: An on-premises package that provides access to the text analytics endpoints. Your license determines which endpoints and languages are active in your installation. The entities endpoint is part of the Entity Extractor, which is deployed through Analytics Server.
The suggestions provided for annotation labels are generated by the entities and morphology endpoints.
The models trained to perform entity extraction are consumed by the entities endpoint.
The models trained to perform event extraction are consumed by the events endpoint.
Entity Training Server: Trains entity extraction models and stores the models while training.
Event Training Server: Trains event extraction models and stores event models for training and event extraction in production.
Indocument Coreference Server: This optional server chains together all mentions of a named entity, including pronouns, job functions, and different forms of the entity's name.
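The composition rules above can be sketched as a small check over a planned installation. This is a minimal illustration, not a product tool: the component names come from the list above, while the function name `check_installation` and its return format are hypothetical. It also reads "one or both of the training servers" as requiring at least one, which is an interpretation of the text.

```python
# Component names are taken from the component list; everything else
# (function name, message wording) is an illustrative assumption.
REQUIRED = {"Adaptation Studio", "Analytics Server"}
TRAINING_SERVERS = {"Entity Training Server", "Event Training Server"}


def check_installation(components):
    """Return a list of problems with a planned installation (empty if none)."""
    installed = set(components)
    # Rule 1: every installation includes Adaptation Studio and Analytics Server.
    problems = [
        f"missing required component: {name}"
        for name in sorted(REQUIRED - installed)
    ]
    # Rule 2 (as interpreted here): at least one training server is present.
    if not installed & TRAINING_SERVERS:
        problems.append(
            "at least one training server is required: "
            + " or ".join(sorted(TRAINING_SERVERS))
        )
    return problems


plan = ["Adaptation Studio", "Analytics Server", "Event Training Server"]
print(check_installation(plan))  # an empty list means both rules are satisfied
```

The optional Indocument Coreference Server is deliberately not checked, since its presence or absence never makes an installation incomplete.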

Required Analytics endpoints
Model Training Suite uses features of Analytics Server to prepare input text and identify candidates for annotations. The following endpoints must be installed and licensed in your installation of Analytics Server for training and extraction.
Endpoint | NER Training | Event Training | Event Extraction |
---|---|---|---|
/entities | ✓ | ✓ | ✓ |
/events | ✓ | | |
/language | ✓ | ✓ | ✓ |
/morphology | ✓ | ✓ | |
/semantics | ✓ | ✓ | |
/sentences | ✓ | ✓ | ✓ |
/tokens | ✓ | ✓ | ✓ |
/info | ✓ | ✓ | ✓ |
/ping | ✓ | ✓ | ✓ |
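As a planning aid, the table can be transcribed into a lookup that reports which endpoints a given task still needs. This is a sketch only: the names `REQUIRED_BY` and `missing_endpoints` are hypothetical, nothing here calls Analytics Server, and the matrix is copied from the table as it reads, so confirm each row against your license before relying on it.

```python
# Transcribed from the table of required endpoints. All identifiers are
# illustrative assumptions, not part of the Analytics Server API.
ALL_TASKS = {"NER Training", "Event Training", "Event Extraction"}
TRAINING_ONLY = {"NER Training", "Event Training"}

REQUIRED_BY = {
    "/entities": ALL_TASKS,
    "/events": {"NER Training"},  # as the row reads in the table; verify
    "/language": ALL_TASKS,
    "/morphology": TRAINING_ONLY,
    "/semantics": TRAINING_ONLY,
    "/sentences": ALL_TASKS,
    "/tokens": ALL_TASKS,
    "/info": ALL_TASKS,
    "/ping": ALL_TASKS,
}


def missing_endpoints(task, available):
    """Endpoints the table lists for `task` that are absent from `available`."""
    if task not in ALL_TASKS:
        raise ValueError(f"unknown task: {task!r}")
    needed = {ep for ep, tasks in REQUIRED_BY.items() if task in tasks}
    return sorted(needed - set(available))


licensed = ["/entities", "/language", "/sentences", "/tokens", "/info", "/ping"]
print(missing_endpoints("Event Training", licensed))
# → ['/morphology', '/semantics']
```

Keeping the matrix as data rather than as branching logic means a corrected or extended table only requires editing `REQUIRED_BY`.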
Deployment options
The components of the Model Training Suite are used for two different purposes: training and production.
Training: Annotation and training of entity and event models. The training environment includes:
Adaptation Studio
Analytics Server
Entity Training Server
Event Training Server
Production: Using previously trained models to perform entity and event extraction. The production environment includes:
Analytics Server
Event Training Server
The training and production environments can use the same instance of Analytics Server or the two environments can be completely separate. You determine how many physical machines are required based on the size of your models and your organization's requirements. The following diagram shows two possible implementations.


Security recommendations
When deploying the Model Training Suite, you must secure both the training environment and the deployed models. There are multiple ways to secure a model training and deployment environment.
Control who has access to the training system and prevent malicious actors from logging into the system.
Assign user access through the user management facility of Adaptation Studio. Control the level of access each user has to the system and limit users to the access they require to complete their work. For example, annotators can only annotate the documents assigned to them, limiting the impact they can have on the models.
Use the project management reports to reduce risks from insider threats as well as from non-malicious human error. By checking that multiple users are training models in a similar and consistent way, an administrator can confirm that malicious actors are not corrupting the model creation process.
Control who has access to model files to prevent them from being altered by malicious parties after they are exported from Model Training Suite and deployed to Analytics Server.
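Access control can be complemented by a check that detects alteration after the fact. One common approach, which this documentation does not prescribe, is to record a SHA-256 digest of each model file at export time and compare it before deployment. The sketch below uses only the Python standard library; the function names and the idea of storing the digest separately from the model file are assumptions made for illustration.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large model files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path, expected_hex):
    """True if the file's current digest matches the digest recorded at export."""
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

In practice, the digest would be recorded when the model is exported, stored somewhere the deployment account cannot write to, and checked with `is_unaltered` before the model is copied to Analytics Server. A digest only detects changes; it does not replace restricting who can read or write the model files.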