Building Data Science Teams

July 24, 2020

Despite concerns about killer robots taking over the world, research shows that adoption of Artificial Intelligence (AI) across industries is accelerating. In 2019, McKinsey reported that AI adoption had increased in most industries, with 58 percent of organizations participating in the survey reporting that they had implemented at least one AI capability, up from 47 percent in 2018. Given this trend, traditional software organizations need to be rewired in order to build high-performing data science teams that can drive AI innovation.

Rewiring the Organization

Traditional software organizations are hard-wired to build software applications that are usually driven by efficiency. Most “smart” applications built in this century have translated business processes into standardized and deterministic rules that can be automated by code. By adopting AI, rules-based applications will now be able to process knowledge and data and generate valuable insights for a business.

Realizing AI’s full potential value goes above and beyond adopting a new technology. It requires shifting mindsets across departments in the entire organization, starting at the very top. Kathryn Hume did a great job of explaining it in a recent AI at Work Podcast: “The opportunity in running an AI-first organization is to shift the mindset around what a business process is: from a vehicle to drive standardization and efficiency, to a vehicle to collect unique knowledge about the world that can only be known via the processes that you have and the customers that you have.” Transforming efficiency-driven companies into knowledge-driven companies will undoubtedly change the way every department works. For many companies, one of the first steps in this transformation is building a data science team.

The First Recruits

Incorporating statistical analysis and applied mathematical models into business operations and products is an age-old practice. Financial services and insurance companies have been employing quantitative analysts for decades. Companies in other industries that are jumping on the AI bandwagon are following suit by hiring a lead data scientist to spearhead innovation efforts.

When building data science teams, the first recruit will likely have a couple of years of experience in the industry and a formal Master’s or Ph.D. degree in computer science, engineering, mathematics, physics, data science or machine learning. Although it is very common for organizations to hire experienced data scientists, Andrew Ng, founder of deeplearning.ai, suggests in a recent interview with Forbes that hiring a couple of engineers to work on a small project is a good way to get the wheels turning and get a feel for what AI can do before defining a strategy for the organization.

However, not even the most talented data scientist will be able to get the job done alone. Unlike the work of the traditional quantitative analysts (quants) in banks and insurance companies, moving machine learning models into existing software applications requires close collaboration between people with a very diverse set of skills. The importance of people with industry, domain, and business knowledge is often underemphasized. Domain and business experts have a critical role in selecting the right questions to ask, identifying the untapped knowledge that would create the most impact, and interpreting the insights extracted from AI models.

The Economist defines the new craftsmanship of data scientists as “the combination of the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data.” To get the job done, data scientists will need clear direction on which “nuggets of gold” to look for and which questions they are expected to answer.

Artificial Intelligence Team Roles

Although many companies currently struggle to hire data scientists, building an AI-first company or an AI product requires teamwork and a more diversified set of skills. The key roles in high-performing AI teams, as summarized in this Forbes article, include the following:

  • AI project manager: The project manager coordinates projects across all job functions and helps a company identify the business challenge it is hoping to solve through AI. They are the link between business leaders and the data science team.
  • Data engineer: The data engineer is responsible for gathering the data and supplementing it with external data if needed. They have specialized business knowledge as well as technical expertise with data models. They analyze and manipulate data so that it is in a clean, normalized state and can be fed to AI models.
  • Data scientist: Data scientists generally focus on research: forming hypotheses and testing statistical models and algorithms. Ph.D. candidates tend to focus on the mathematical theory rather than on applications in industry.
  • Machine learning engineer: Machine learning (ML) engineers integrate mathematical models into live industry applications and products. They combine expertise in AI models with experience in designing and programming applications and architectures that scale.
  • AI ethicist: The role of AI ethicist is still evolving. It emerged to meet the need to ensure that AI models address any underlying bias in the data and enforce fairness throughout the AI lifecycle.

While there is general agreement that AI teams must include the roles described above, there is no consensus on the right proportions for each role. In the beginning, it is likely that one person will fulfill multiple roles. As the organization grows and evolves in AI adoption, so will the roles within the team. Some organizations that start with a team of three data scientists may end up with a team of two data scientists and three machine learning engineers. Likewise, other organizations that rely heavily on engineers may evolve to hire more data scientists. The right proportions can vary greatly, and each organization will adjust and evolve to find its own unique path.

Hiring vs Training Your Own

Here at Wovenware we’ve built an AI team from the ground up, hiring experienced workers and training new hires and experienced software developers. There is some hype in the industry about hiring only Ph.D.-level data scientists. They are often treated as unicorns. But it’s important to nurture a keen sense for identifying talent with the right quantitative and analytical skills. Those who are hungry to solve complex problems will jump at the opportunity to up-skill. After going through both academic and on-the-job training, Wovenware trainees learn how to read research papers, conduct experiments, build AI models and communicate insights with effective storytelling techniques. They are adequately equipped with the knowledge and tools to solve AI problems.

Creating both broad and specialized training programs is a wise move for any organization looking to build internal data science teams. Experience shows that people with formal academic training do not always have an advantage over people who learn through a combination of online programs and on-the-job training. For example, in an earlier post we outlined some of the courses Wovenware software developers in training have taken to become machine learning engineers. A strong internal team will combine formally trained new hires with internal talent that has deep knowledge about your organization and applications.

Data Science Team Structure

Companies are implementing three main business models to integrate data science teams with the rest of the organization:

  • Centralized: Projects across the organization are handed off to a central data science team.
  • Decentralized: Business units across the organization are assigned dedicated data scientists and engineers.
  • Hybrid: Business units have designated AI experts who work with them on a daily basis, but these experts also report to and rely on a centralized data science team to execute projects.

Each business model varies in the way and the frequency with which data science team members interact with business units and among themselves. It is fairly common to have reporting structures evolve over time for the same company.

Partnering with External Teams

Whether an AI team is centralized or decentralized, every organization should have access to external teams that can extend and augment its internal team’s competencies or add velocity to its development. The staffing and technology infrastructure required to develop scalable AI solutions are often beyond the reach of many organizations. As explained in an earlier post, by turning to AI outsourcing and nearshoring, organizations can get the high-quality AI solutions they need cost-effectively.

Getting Started

Andrew Ng gives companies insightful advice on where to begin when it comes to building an AI team. He says, “Figure out how to jump in. Even if it’s just a junior programmer on small projects, get the wheels going and feel what an AI application can do. Companies tend to want strategies around everything. But if a company has never done an AI application, they can’t strategize properly, so C-suites develop strategies that look cut and pasted from a newspaper headline. Someone else’s strategy is rarely right for them. So, I say: Just hire a couple of engineers to see what they can do, and keep growing from there.”

Wovenware COO Carlos Meléndez also suggests nearshoring as a fast path and shortcut to AI deployment. By deploying AI as a service, paying as you go for the development of specific algorithms, companies can forgo the costs of building an AI infrastructure, hiring dedicated data scientists, and developing private crowds. In an age when “build-your-own” is being replaced with “as a service,” nearshoring is gaining renewed traction as a speedy vehicle to get on the AI fast track.

Scaling the Team

Building data science teams requires a combination of aligning the corporate culture and mindset with that of a knowledge-driven economy, providing executive sponsorship for innovation initiatives, finding the right talent, and combining cross-functional and technical expertise in the team. The success of artificial intelligence lies in looking beyond the data science role and following a holistic and practical approach to drive greater innovation.
