A key barrier to adopting machine learning is not a lack of data but a lack of labeled data. Labeling data is expensive, and the difficulty of sharing and managing large datasets for model development makes it a struggle to get machine learning projects off the ground.
That’s where our “learn more from less data” approach comes into action. At JPMorgan Chase, we focus on reducing the amount of data needed to build models. Rather than labeling ever more data, we build gold training datasets, which helps reduce labeling cost and increases the agility of model development.
Labeled data is a group of samples that have been tagged with one or more labels. Once a labeled dataset is available, a machine learning model can be trained on it, so that when new, unlabeled data is presented to the model, it can predict a likely label for that data. A gold training dataset is a small, labeled dataset with high predictive power.
Active learning is a form of semi-supervised learning that works well when you have a lot of data but face the expense of getting that data labeled. By identifying the samples that are most informative, teams can label only the data points that most improve the quality of the model.
Using machine learning (ML) models, active learning can help identify difficult data points and ask a human annotator to focus on labeling them.
To explain passive learning and active learning, let’s use the analogy of teacher and student. In the passive learning approach, a student learns by listening to the teacher's lecture. In active learning, the teacher describes concepts, students ask questions, and the teacher spends more time explaining the concepts that are difficult for a student to understand. Student and teacher interact and collaborate in the learning process.
In ML model development using active learning, the annotator and the modeler interact and collaborate. The annotator provides a small labeled dataset. The modeling team builds a model and generates input on what to label next. Within a few iterations, teams can build refined requirements, a labeled gold training set, an active learner and a working machine learning model.
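The iteration described above can be sketched as a small pool-based loop. This is a minimal illustration under stated assumptions, not the firm's implementation: the one-dimensional data, the `annotator` rule (a stand-in for a human labeler) and the threshold "model" are all hypothetical, chosen so that the example runs with the standard library alone.

```python
def annotator(x):
    """Stand-in for the human annotator (hypothetical ground-truth rule)."""
    return int(x >= 63)

def fit_threshold(labeled):
    """Toy model: boundary halfway between the closest known 0 and known 1."""
    max_neg = max(x for x, y in labeled if y == 0)
    min_pos = min(x for x, y in labeled if y == 1)
    return (max_neg + min_pos) / 2

pool = list(range(101))                 # all data points, 0..100
labeled = [(0, 0), (100, 1)]            # small seed set from the annotator
unlabeled = [x for x in pool if x not in (0, 100)]
queries = []

for _ in range(6):                      # a few iterations
    threshold = fit_threshold(labeled)  # modeler rebuilds the model
    # Ask for the point the model is least sure about:
    # the unlabeled point closest to the current decision boundary.
    query = min(unlabeled, key=lambda x: abs(x - threshold))
    labeled.append((query, annotator(query)))
    unlabeled.remove(query)
    queries.append(query)

threshold = fit_threshold(labeled)
correct = sum(int(x > threshold) == annotator(x) for x in pool)
```

In this toy setup, the two seed labels plus six queried labels are enough for the model to classify all 101 points in the pool correctly, because each query lands where the current model is least certain. Real data is rarely this clean, so the saving should be read as illustrative only.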
To identify difficult data points, we use a combination of methods, including:
Classification uncertainty sampling: When querying for labels, the strategy selects the samples with the highest uncertainty, that is, the data points the model knows least about. Labeling these data points makes the ML model more knowledgeable.
Margin uncertainty: When querying for labels, the strategy selects the samples with the smallest margin, where the margin is the gap between the model's two most likely classes. These are data points the model knows something about but isn’t confident enough to classify well. Labeling these examples increases model accuracy.
Entropy sampling: Entropy is a measure of uncertainty. It is proportional to the average number of guesses one has to make to find the true class. In this approach, we pick the samples with the highest entropy.
Disagreement-based sampling: In this method, we pick the samples on which different algorithms disagree. For example, if the model classifies into five classes (A, B, C, D and E), we might use five different classifiers, e.g.
Bag of words
LSTM
CNN
BERT
HAN (Hierarchical Attention Networks)
The annotator then labels the examples on which the classifiers disagree.
Information density: In this approach, we focus on the denser regions of the data and select a few points in each dense region. Labeling these data points helps the model classify the large number of data points around them.
Business value: In this method, we focus on labeling the data points that have higher business value than the others.
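Several of the selection methods listed above reduce to a score computed per sample. The sketch below shows one common way to compute them. The function names, the three sample probability vectors and the committee votes are illustrative assumptions, not taken from any particular library or from our production code.

```python
import math
from collections import Counter

def least_confidence(probs):
    """Classification uncertainty: 1 minus the top predicted probability."""
    return 1.0 - max(probs)

def margin(probs):
    """Gap between the two most likely classes; a small gap is a close call."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second

def entropy(probs):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def vote_entropy(votes):
    """Disagreement: entropy of the labels voted by a committee of classifiers."""
    counts = Counter(votes)
    return entropy([c / len(votes) for c in counts.values()])

# Hypothetical predicted class probabilities for three unlabeled samples.
pool = {
    "sample_1": [0.90, 0.05, 0.05],
    "sample_2": [0.40, 0.40, 0.20],
    "sample_3": [0.34, 0.33, 0.33],
}

most_uncertain = max(pool, key=lambda k: least_confidence(pool[k]))
smallest_margin = min(pool, key=lambda k: margin(pool[k]))
highest_entropy = max(pool, key=lambda k: entropy(pool[k]))

# Hypothetical votes from five classifiers on two samples.
agreed = vote_entropy(["A", "A", "A", "A", "A"])     # full agreement
split = vote_entropy(["A", "B", "C", "D", "E"])      # full disagreement
```

Note that the strategies do not always pick the same sample: here margin sampling selects `sample_2`, whose top two classes are tied, while least confidence and entropy both select `sample_3`, whose probabilities are spread almost evenly across all classes. That difference is one reason to combine methods rather than rely on a single score.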
Traditionally, data scientists work with annotators to label a portion of their data and hope for the best when training their model. If the model isn’t sufficiently predictive, more data is labeled, and they try again until its performance reaches an acceptable level. While this approach still makes sense for some problems, for those that have vast amounts of data or unstructured data, we find that active learning is a better solution.
Active learning combines the power of machine learning with human annotators to select the next best data points to label. This intelligent selection leads to the creation of high-performance models in less time and at lower cost.
The Artificial Intelligence & Machine Learning group is focused on increasing the volume and velocity of AI applications across the firm by helping develop common platforms, reusable services and solutions.