How Venture Capital and Angel Investors Should Evaluate AI Technology
Steven Shwartz

In 2017, PricewaterhouseCoopers analyzed 300 AI use cases and estimated that AI would contribute $15.7 trillion to the global economy in 2030 [1].  For comparison, the World Bank estimated 2017 global GDP at just under $81 trillion.

No wonder that AI is a major target of investment by public company investors, private equity investors, venture capitalists, and angel investors.  According to the Stanford Hundred Year AI Survey, investment in AI startups increased from $1.3B in 2010 to $40.4B in 2018 globally [2].

As an angel investor, I evaluate many AI-based startups every month.  In fact, most software technology startups pitch themselves as AI-based.  I am also a serial entrepreneur with multiple AI startups in my background including one that had an IPO and another that became a leading business intelligence product.

Types of AI Companies

Application Vendors:  Most AI products will be applications where AI technology makes the application possible.  For example, products that listen to social media for customer feedback and products that analyze medical images require AI technology.

Tool and Platform Vendors:  These AI products are designed to be used by developers to create applications.  Clarifai's facial recognition tool is an example: law enforcement agencies and many other types of organizations can use it to build applications requiring facial recognition.  Platforms offer a wide range of tools.  Google, Microsoft, Amazon, IBM, and many other vendors offer AI platforms, as do open-source vendors like KNIME and RapidMiner.

The tools and platforms space is very crowded.  The platform vendors are well-established and it will be an uphill battle for new products.  There may still be niches for tools that are underserved; however, there are major players in the biggest niches such as social listening and facial recognition.

Service Vendors:  These are AI offerings provided by service companies.  For example, most of the global IT consulting companies have AI-based service offerings.  This is also an area that is densely populated with major players.  It will be difficult for newcomers to break in.

In this article, I will focus on evaluating AI-based applications.

Evaluating AI-Based Applications

Most of your evaluation of the company should use the same criteria that you use for non-AI investments, i.e. the management team, product, market, sales plan, competition, and financials.  For example:

  • Value Proposition:  Does the application increase revenue, reduce costs, or reduce risks for its customers?  Is there a hard ROI?
  • Competition:  Does it disrupt an existing space with new capabilities or lower costs?  Is the value significant enough to displace entrenched competition?  Is there a competitive barrier to entry?  Is it an area that is attractive to technology giants like Google and Amazon?
  • Product Development Status:  Is there a v1 in production?  Are there paying customers?
  • Market:  What is the addressable market for the product and the timeline for expected penetration?  What is the buyer persona?  How does pricing work?
  • Marketing:  How will leads be acquired?  Inbound?  Outbound?  Trade shows?  Founders knocking on doors?
  • Sales:  What is the sales plan?  Inside or outside sales?  What is the sales cycle?
  • Financials:  Does the financial plan make sense?  What are the anticipated capital requirements?
  • Exit Strategy:  Who are the potential buyers?
  • Cap Table:  What is the ownership structure?  Is the option pool sufficient?
  • Raise:  Terms and timing?  Use of funds?  Will prior investors participate?

This can be challenging because AI-based companies often have some really smart technologists.  However, try not to be overly impressed with the team’s technical backgrounds, their use of words you do not understand, or how smart they seem.  Focus first on the business just like you would for a non-AI company.  During this process, identify the professed value of the AI technology.  Does it make the application possible?  Do the founders claim it gives them a competitive moat?

AI-based applications typically use either machine learning or natural language processing and each has different evaluation criteria.

Machine learning algorithms take a table of data as input.  Supervised learning algorithms have both input and output columns in the table.  For example, to create a supervised learning system that predicts house sale prices, one would create a training table with one row for each historical house sale.  The input columns would be “square footage”, “number of bathrooms”, and other relevant factors.  The output column would contain the actual historical sale price of the house.  By learning how to predict the output column value in each row, the system can learn to roughly predict the selling prices of houses on the market.  To create a reasonably accurate system (like Zillow), one would need hundreds of input columns and thousands of rows of actual house sales.
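As a sketch of this workflow, here is a toy version of the house-price example using scikit-learn.  The column values and prices are invented for illustration; a real system would need far more columns and rows.

```python
# A minimal sketch of supervised learning on a (toy) house-sale table.
# All numbers below are invented for illustration.
from sklearn.linear_model import LinearRegression

# Training table: one row per historical sale.
# Input columns: square footage, number of bathrooms.
X_train = [
    [1200, 1],
    [1800, 2],
    [2400, 3],
    [3000, 3],
]
# Output column: actual historical sale price.
y_train = [240_000, 330_000, 420_000, 480_000]

model = LinearRegression().fit(X_train, y_train)

# Predict the price of a house currently on the market.
predicted = model.predict([[2000, 2]])[0]
print(round(predicted))
```

With only four rows, the learned function is unreliable; the point is the table shape, not the model quality.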

Similarly, one could use supervised learning to recognize faces.  A facial recognition system table would typically have one row per facial image.  The input columns would be the individual pixels of the image – one pixel per column.  The output column would be the name of the person in the image.  The algorithm would learn to predict the output column value (the name) from the input columns.  The learned system will only be able to recognize people who appear in the training table, but it will be able to recognize new images of those people that it did not see during training.
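The same pattern can be sketched for face recognition, with flattened pixel vectors as the input columns.  The two “people” below are synthetic pixel patterns, not real images, and the noise stands in for lighting and pose variation.

```python
# A toy sketch of face recognition as supervised learning.
# Each "image" is a flattened vector of 16 synthetic pixel values;
# real systems use far larger images and far more rows per person.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Two fictional people; each person's images cluster around a
# distinct average pixel pattern.
faces = {"alice": rng.uniform(0, 1, 16), "bob": rng.uniform(0, 1, 16)}

X, y = [], []
for name, base in faces.items():
    for _ in range(20):  # 20 training images per person
        X.append(base + rng.normal(0, 0.05, 16))  # lighting/pose noise
        y.append(name)

clf = LogisticRegression(max_iter=1000).fit(X, y)

# A new, never-seen image of "alice": same underlying face, fresh noise.
new_image = faces["alice"] + rng.normal(0, 0.05, 16)
print(clf.predict([new_image])[0])
```

Note that the classifier can only ever output the names it was trained on.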

Supervised learning is by far the most frequently used type of machine learning but unsupervised learning and reinforcement learning are also used. Combinations of supervised and unsupervised learning are also common and some applications use all three (e.g. self-driving cars).

This article focuses on evaluating supervised learning technology.  Future articles will discuss evaluating other types of AI technology.

Evaluating Supervised Learning Technology

Here are ten questions you should ask the supervised learning technology team:

(1) Is the data proprietary or culled from public data sources?

Proprietary data can represent a significant competitive moat. Public data is available to anyone.

If it is public data, does the team gain a competitive advantage by a unique method of pulling together multiple public data sources?  For example, there might be a combination of public data that took years to figure out.  Or, the team might have performed preprocessing transformations on the data that would take time for a competitor to duplicate.  Or, the team might have developed a unique technique to augment the data which, if non-obvious, could represent a significant competitive moat.  If public data is used, estimate the cost and time a competitor would need to produce a competitive training table.

Are there gaps in the data?  What is the quality of the data labels?

How much effort must be put into adapting the dataset for each new customer?  Do interfaces need to be written?  Does data need to be cleaned and transformed?  If too much effort is required, the application becomes more of a service than a software offering, which may result in a much lower exit multiple.

Finally, how much effort is required to maintain the dataset and keep it up to date?

(2) Does the training dataset exist?

This sounds like an unnecessary question.  If the company is using supervised learning, then the training dataset must exist, right?  Surprisingly, no.  I run across many startups that are planning to create a large training dataset once they have a large number of customers.  When I discover this circumstance, I evaluate the company as if it did not have any AI technology.  Why?  Because the company will not be able to rely on the AI technology to acquire that large set of initial customers.

If the table exists, ask how many rows it has.  Many algorithms, especially deep neural networks, need millions of rows of data.  Classical supervised learning algorithms often require far fewer rows.  With too few rows, the learned function will be unreliable.
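One way to see why row count matters is to train the same model on progressively larger slices of a dataset and compare accuracy on held-out data.  The dataset below is synthetic, with a simple hidden rule standing in for the real-world pattern.

```python
# Sketch: held-out accuracy of the same model trained on
# progressively larger slices of a synthetic training table.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # a simple hidden rule

# Hold out 25% of the rows to stand in for real-world data.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
for n in (20, 200, 1500):  # train on slices of increasing size
    clf = LogisticRegression(max_iter=1000).fit(X_tr[:n], y_tr[:n])
    scores[n] = clf.score(X_te, y_te)
print(scores)
```

Accuracy typically climbs and then plateaus as rows are added; a team should be able to show you a curve like this for their own data.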

(3) What are the input variables in the training dataset?

How representative are the input variables of the real-world data?  For example, a dataset that predicts housing sale prices but does not contain a waterfront variable may be predictive for areas with no waterfront but will miss the mark for areas near the water.

(4) What supervised learning method is used?

The main difference between predicting house sale prices and recognizing faces is that the former uses classical machine learning to compute the prediction function. For decades, this type of supervised learning was known as statistics until it gained prominence from being folded into the AI rubric.  The facial recognition system uses a modern deep learning algorithm.  Both classical and modern supervised learning compute functions that predict the value of the output column.  Importantly, deep learning networks have not superseded classical methods.

In many cases, modern methods outperform classical methods, especially for very large training tables.  Modern methods also offer automatic creation of features and free users from having to decide on parametric assumptions about the data.

Even so, classical methods often produce superior results in terms of performance, cost, and interpretability.  Be on the lookout for teams that assume deep learning is the answer to every problem.  They may be vulnerable to competitors that use less expensive and more explainable classical methods.

There are many supervised learning methods, and each carries assumptions about the underlying data.

You should have a data scientist evaluate the team’s choice(s) and make sure they are reasonable.

(5) How was generalization tested?

Supervised learning systems almost always have higher predictive accuracy in the training environment than in the real world.  This is because the rows in the training table only represent a sample of the rows that might be encountered in the real world.  If there are any data patterns in the real world that are not present in the training table, the system will be less predictive in the real world than it is in the training environment.  Similarly, if there are any data patterns in the training table that are not present in the real world, the same thing will happen.

There are methods for estimating how well a system will generalize from the training environment to the real world.  Have a data scientist determine if the estimate of real-world predictive capacity makes sense.
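A standard such method is k-fold cross-validation, which repeatedly holds out part of the training table as a stand-in for unseen data.  A minimal sketch using scikit-learn's built-in breast-cancer dataset (chosen purely for illustration):

```python
# Estimating real-world accuracy with 5-fold cross-validation.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
clf = LogisticRegression(max_iter=5000)

# Each fold trains on 4/5 of the rows and scores on the held-out 1/5.
scores = cross_val_score(clf, X, y, cv=5)
print(f"estimated accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```

If the team's claimed accuracy comes only from the training rows themselves, with no held-out estimate like this, treat the claim with suspicion.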

(6) What toolset is being used?

You will find some teams that create their own tools.  This is expensive and often unjustified.  You will also find teams that use expensive proprietary tools, which is also usually unjustified given the quality of open-source libraries such as scikit-learn.  Finally, ask the team to explain the level of tools they use.  There are good off-the-shelf image recognition libraries that will suffice for many applications, or at least reduce costs, and these are often more appropriate than building convolutional neural networks from scratch with lower-level tools.

(7) Is the technology team in-house or is the technology development outsourced?

I do not like outsourced technology.  It is very expensive to create, and most teams with outsourced technology eventually recognize they need to bring it in-house.

You need to recognize that moving the technology in-house has many risks.  First, the company needs to hire the right team.  Second, there will be opportunity risk while the transition is occurring, because the new development team will need time to learn the technology.  This situation is usually a deal-breaker for me.

(8) How difficult will it be to scale the technology team?

Machine learning personnel can be very expensive and hard to find.  The most difficult case is one in which the number of personnel with machine learning expertise must grow linearly with revenues.  Of course, this is probably indicative of a service rather than a software business.  The best case is an application that only has a small core of machine learning that can be handled by a small team that will require little growth in proportion to revenues.

(9) Has the technology been examined for discrimination?

Many machine learning applications have been criticized recently for being inadvertently discriminatory.  For example, machine learning systems that predict criminal recidivism have been found to discriminate against African Americans.  Similarly, facial recognition systems often perform poorly for minority groups, which puts members of those groups at risk of being falsely identified as terrorists or criminals.  You do not want to invest in a company only to find that the technology develops a negative reputation for being discriminatory.
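A basic due-diligence check is to compare the model's error rates across subgroups on held-out data.  The sketch below uses synthetic predictions that simulate a model erring more often on one group; real audits would use the team's actual model outputs and demographic labels.

```python
# Sketch of a disparity check: compare error rates across subgroups.
# Group labels, ground truth, and predictions are all synthetic.
import numpy as np

rng = np.random.default_rng(1)
group = np.array(["a"] * 500 + ["b"] * 500)
y_true = rng.integers(0, 2, size=1000)

# Simulate predictions that err 5% of the time on group "a"
# and 20% of the time on group "b".
noise = np.where(group == "a", 0.05, 0.20)
flip = rng.random(1000) < noise
y_pred = np.where(flip, 1 - y_true, y_true)

errs = {}
for g in ("a", "b"):
    mask = group == g
    errs[g] = float((y_pred[mask] != y_true[mask]).mean())
    print(f"group {g}: error rate {errs[g]:.2%}")
```

A large gap between groups, as simulated here, is exactly the kind of result that should prompt hard questions before investing.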

(10) Are the application’s predictions explainable?

Machine learning systems that make important decisions such as loan decisions have come under fire for not being explainable.  The European Union now requires all service providers who use automated decision systems that impact people’s lives to explain the reason for the system’s decision.  Some machine learning algorithms are explainable (e.g. decision trees) and others are difficult to explain (e.g. deep learning).  It would be a mistake to invest in a company that uses non-explainable machine learning technology to make decisions that impact people’s lives.
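Decision trees illustrate what explainability looks like in practice: the learned rules can be printed as human-readable splits.  The loan data below is invented for illustration.

```python
# An explainable model: a decision tree whose learned rules can be
# printed as if/then splits. Toy loan data, invented for illustration.
from sklearn.tree import DecisionTreeClassifier, export_text

# Input columns: income (thousands), debt-to-income ratio.
X = [[30, 0.6], [45, 0.5], [60, 0.2], [80, 0.1], [25, 0.7], [90, 0.15]]
y = [0, 0, 1, 1, 0, 1]  # 1 = loan approved

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the learned decision rules in plain text.
rules = export_text(tree, feature_names=["income", "dti"])
print(rules)
```

A deep learning model offers no comparable readout, which is why model choice matters for decisions that impact people's lives.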


The first step in investing in AI-based companies is the same as for non-AI companies.  If the company passes muster on these criteria, there should be a deep dive into the technology.  In this article, I’ve explained how to evaluate supervised learning technology.  In a future article, I will discuss natural language processing technology.