Principal Data Scientist

About us…

We’re a global intelligent automation consultancy in a period of hypergrowth. Founded in 2020 in the UK, TQA has raised $20m in Series A funding and expanded to offices in the US, Argentina, Romania and the Philippines. We’re a team of 140 extremely talented individuals across the globe who love to solve problems together. We passionately believe that automation has the power to expedite business transformation, enhance enterprise value and deliver a positive impact on people’s lives.

We’re looking for a highly skilled Principal Data Scientist to play a pivotal role in maintaining, upgrading and scaling our AI-powered systems, integrating these models into business workflows to deliver transformative, real-world applications, and helping our clients thrive. If you’re passionate about Machine Learning and thrive in a fast-paced environment, we want to hear from you.

This role will involve a diverse range of responsibilities that may evolve over time, requiring flexibility and adaptability as the business and its projects grow and change, offering the opportunity to work across multiple areas of data science, cloud infrastructure, and machine learning.

About you…

6+ years Data Science consultancy experience (PhD + 3 years, Masters + 4 years, or Grad + 6 years - the degrees ideally would be Computer Science, Data Science or other STEM):

3+ years client-facing experience
At least two projects delivered into production
Strong experience conducting Exploratory Data Analyses for customers across multiple industries and on different types of data: structured and unstructured, image, natural language, etc.
Experience with supervised, semi-supervised and unsupervised ML
Able to flex to project and client requirements, which may necessitate pragmatic compromise in order to deliver an outcome
Knowledge and use of publicly available delivery accelerators, such as code repositories that can be used as a base, pretrained models etc.
Able to use Generative AI models and workloads to accelerate delivery
Experience with go-to-market offering creation and development, the presales and sales cycles and how to support them technically, and ongoing support of delivered projects

3+ years of experience in team leadership and management:

Head of/Lead-level positions
Must have had at least 2 direct reports (i.e. as their Line Manager) who were Data Scientists
Involved in growing teams - experience in hiring, firing, growth areas etc
Used to collaborating across geographies and timezones, adopting a ‘federated’-type approach - self-contained, but operating within and accountable for delivering against the overall global strategic priorities
Comfortable with flexing work hours to fit client requirements, such as UK timezones

The ideal candidate will have…

Experience with carrying out Data Science and ML workloads on AWS. These would include:

Amazon SageMaker (including training jobs, hyperparameter tuning jobs, inference endpoint deployment)
‘Core stack’ including Lambdas, S3, ECS etc.
Data transformation technologies, including Glue/Athena (but not expecting full expertise)
Strong knowledge of SageMaker Augmented AI and Ground Truth, leveraging their outputs to improve ML models
Serverless ML design patterns for AWS, including using Lambdas and Step Functions to orchestrate SageMaker

Python open-source Data Science stack, including:

Plotting: matplotlib, seaborn, plotly etc.
Data manipulation: pandas, numpy, scipy etc.
ML/Deep Learning: scikit-learn, pytorch etc.
Generative AI: OpenAI API usage etc.

Communication skills:

Able to present technical information to non-technical audiences
Able to demonstrate the art of the possible to SMEs who don’t know what is possible with their data, and to identify data (public or otherwise) that the client might not know about or think to use
Able to work with domain experts outside their skillset, which may include Solution Architects, Data Engineers, MLOps Engineers etc
Able to creatively solve problems, both human and machine, to deliver the greatest change and strongest outcome given the true landscape for a project or series of projects
Strong English language skills

Experience with taking on, maintaining, supporting and upgrading production codebases:

Experience responding to tickets logged against business-critical systems
Able to triage issues and attempt to solve them, such as data quality issues affecting the output of a production ML workload
Able to communicate and collaborate with experts in other areas, such as Data Engineering, Web Development and MLOps, to solve issues that span multiple areas

Experience with Databricks to deliver Data Science outputs, ML models and production-grade solutions

Experience with multiple hyperscalers, especially Azure, and ideally Azure OpenAI
Strong experience with production, such as understanding how to implement ML into end-user-facing workloads and embedding them in business processes effectively
PoC to production experience with Generative AI, such as RAG deployment or GPT-X API integration into a user-oriented workflow
Data Warehouse/Data Engineering technology exposure or experience, such as configuring Databricks from the outset for a new environment

Sounds good? Here’s an outline of the responsibilities.

Team cultivation (5-10%):

Oversee local team development (1:1s etc.) and hold country-wide DS Huddles to share knowledge and discuss the cutting edge
Create and refine go-to-market materials and offerings, such as how we will best leverage the Databricks best practices for our clients and how this dovetails into our broader Data offerings (working with the Data Science Director and other Data principals)

Managed service response (5-10%):

We have signed a Managed Services agreement with a major UK-based company. They have a break-fix arrangement with us; however, many of the issues they report and identify actually stem from underlying data quality issues.
We want to identify these proactively, but also respond to the client with triaged and catalogued issues
You would be the main cover for the initial triage: establishing whether or not a reported issue is data quality-related. An example is a malformed field in a form being parsed as a string instead of an integer, which is simple but difficult to uncover without going through the codebase etc.

Research and development (10-20%):

You would be working closely with our local and international teams to create accelerators and other IP that will enhance our offerings to clients
This involves staying at the state of the art, such as reading and presenting the latest papers in journal clubs, as well as sharing more publicly through blogs and meetups
TQA understands the need for this to be part of your role officially, and as such it is

Delivery (60-80%):

For some current and new customers, you would run our AI Advisory and Data Science as a Service delivery models as the key expert DS. Initially, this will mostly be a solo operation, but in time you will oversee teams of more junior colleagues who will support you across multiple projects.

Our offerings include:

- Customer-facing workshops, to identify and prioritise which use cases are most important and of highest value, and to create a plan of attack based on assumptions that will later be proved or disproved.

- AI Strategy consultation, to identify where companies currently are on their AI/ML journey, to check whether this matches what they believe, and to resolve any discrepancies respectfully

- Exploratory Data Analysis, on each use case for 1-2 weeks full-time depending on the complexity of the data and of the task. We seek to validate the assumptions made in the earlier stages, and ultimately refine our plan for proof-of-concept. This is also the stage where we and the customer can hit a breakpoint if either does not believe the ROI calculation passes muster

- Proof-of-concept, taking the use case and demonstrating its value in principle, in 6-8 weeks. Usually, this will involve some AI/ML and/or Generative AI, but we would not usually create a new model/paradigm from scratch, instead using existing machinery and templates, such as Amazon SageMaker built-in models

After these stages, we would work with, but hand over to, a dedicated 4-8 person team to take the project down the path to production, starting with a Minimum Viable Product (4-8 weeks), and then eventually working with the client's team to get it into production and operationalise the usage of the solution. You would still be leading this, but from a thought-leadership perspective and in an ‘engagement lead-like’ role, ensuring that there is continuity of oversight from start to finish of the solution development. You would have the support of Data Engineers, MLOps Engineers, Cloud Architects, Solution Architects, BI Engineers etc., who each specialise in their respective areas and can work together to build the strongest possible solution within the constraints of the project

You would also work with Sales to create delivery models and offerings for a specific customer and specific project. This still comes under Delivery, as that is the main priority: Presales ought to be carried out in the guise of the other types of work described above, but also within projects as continuation or upsell, i.e. as part of your Delivery role on the project
