Format
Web-based training
Duration
2.5 hours
Intended Audience
Compliance support professionals, and business users responsible for managing their organization's regulatory compliance process.
Prerequisites
Basic computer literacy. An understanding of surveillance requirements is also helpful.
Overview
This web-based course introduces Cognition Studio, with a focus on data science tasks. Attendees will learn their way around the Cognition Studio environment and how to develop and implement models that flag potential violations, helping them meet their surveillance requirements.
Course Topics
- Introduction to Conduct Surveillance and the compliance use case
- Introduction to the "scenario-based" approach
- Introduction to machine learning modules
- Machine learning modules for text classification
- Supervised machine learning
- Predictions and confidence score
- Confidence threshold
- Quality metrics for text classifiers
- Precision/recall/F1 score
- Confidence threshold
- PR Curve
- Cross-fold validation
- Creating a model in the Cognition Studio UI
- Creating a new model project
- Creating a new label set
- How to choose training data
- Binary versus non-binary classifiers
- Classifier spans: sentence-level, document-level, and "other"
- Bootstrapping with examples and keywords
- Labeling data
- Samplers: keyword, random, search, top predictions, random predictions, highest entropy
- Tagging guidelines for ambiguous and confounding samples
- "Cross-set" tagging option
- Iterating on the tag/train/predict loop
- Annotation consistency
- Importance of annotation consistency
- Recommendations for creating annotation guideline documentation
- Model Evaluation in the Cognition Studio UI
- Evaluation on unlabeled datasets
- Evaluation on labeled datasets
- (OPTIONAL) Model behavior "under-the-hood"
- Data normalization
- Text features
- Pre-trained word vectors
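
As a preview of the quality-metrics topics listed above, here is a minimal Python sketch of how a confidence threshold affects precision, recall, and F1 score for a binary text classifier. It is generic illustration only and does not use Cognition Studio or its API: the function name `precision_recall_f1` and the sample scores and labels are hypothetical, made up for this example.

```python
from typing import List, Tuple


def precision_recall_f1(
    scores: List[float], labels: List[bool], threshold: float
) -> Tuple[float, float, float]:
    """Hypothetical helper: score predictions against true labels at one threshold."""
    # A sample is flagged when its confidence score reaches the threshold.
    predicted = [s >= threshold for s in scores]

    tp = sum(p and y for p, y in zip(predicted, labels))        # flagged, truly positive
    fp = sum(p and not y for p, y in zip(predicted, labels))    # flagged, truly negative
    fn = sum((not p) and y for p, y in zip(predicted, labels))  # missed positives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Made-up confidence scores and ground-truth labels for five samples.
scores = [0.95, 0.80, 0.65, 0.40, 0.20]
labels = [True, True, False, True, False]

for threshold in (0.5, 0.9):
    p, r, f1 = precision_recall_f1(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

With this sample data, raising the threshold from 0.5 to 0.9 increases precision while lowering recall. Plotting precision against recall across many thresholds gives a PR curve.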