Tonic.ai editorial

Data is the new code: the evolution of software development

Author
Ken Tune
Author
July 2, 2025

The rise of data engineering in the domain of software engineering

The impact of the latest wave of artificial intelligence innovations is undeniable. Hype aside, surveys such as McKinsey’s ‘Open Source technology in the age of AI’ confirm pervasive usage of AI technology with more than half of respondents making use of models, tools and incorporating prompt handling into their applications.

Usage is highlighted as with training costs in the billions of dollars, very few businesses will be engaged directly with the development of AI capabilities. Instead nearly all will be applying these innovations to their own products.

To understand what is happening in practice, we can regard the AI components as the processing elements in the systems analysis flow depicted below. Interesting though AI components are, they are not the subject of this article. Instead we consider the input. In the context of software and service development the input is our data. In this brave new world of AI-enabled features and services, for most businesses the critical item to consider is the fuel for the engine, the ingredients for the kitchen, the data itself.

Is data the new code?

As with nearly all Snowclones and respectful of Betteridge’s law of headlines the immediate answer is of course, no, it’s not. Better however to move beyond a literal response and consider what is really meant by the question. There are a number of secondary queries that we are implicitly encouraged to consider.

  • Has the role of data assumed greater importance in the sphere of software and service development?
  • Is the work of managing data now a much more significant part of the development process?
  • Is it true to say that the behaviour of software is now governed by the data resources to a much greater extent than it was in the pre ChatGPT era?

The answer here is very much yes. We are visibly surrounded by AI enabled software and what differentiates one AI based application from another is the data that it is trained with and utilises.

Examples of data-driven products

The following give a flavour of the broad reach of data-enabled applications in business today.

  • Music streaming: 35% of listening on Spotify followed algorithmically generated playlists according to 2023 research from yourmusic.marketing. 
  • Sales / CRM: A decade ago, Salesforce set itself the goal of being an ‘AI-first Company’. Today the platform offers a wealth of AI-powered functionality including analytics and generative features whose creation has necessitated training and testing using large scale data resources. 
  • Clinical notetaking software Nabla was trained using, with consent, patient data derived from 30,000 visits to specially commissioned virtual clinics. 
  • Programmatic advertising / marketing: This has long been data driven. For a comprehensive overview, see the AdTech Book from ClearCode.
  • Retail: Most evident in Amazon’s use of recommendations and automated review and product summaries, data-driven features are ubiquitous in online retail.
  • Insurance: For a glimpse of the future, consider how insurers like Allianz believe AI might transform risk modelling and claim processing.

Why data engineering is critical to the future of software

In each of the above use cases we can see how data is an equal partner with code and therefore there is a competitive advantage available to those who best manage and harness the data resources they have. Unsurprisingly we have therefore in recent years seen the rise of the data engineering professional. A useful summary of the skills needed in this domain is provided by Coursera.

  • Acquire datasets that align with business needs
  • Develop algorithms to transform data into useful, actionable information
  • Build, test, and maintain database pipeline architectures
  • Collaborate with management to understand company objectives
  • Create new data validation methods and data analysis tools
  • Ensure compliance with data governance and security policies

It is interesting to note that a data engineer works with a high level of autonomy, which is what in part makes this such an attractive role. In nearly all the activities above there is considerable freedom available to the practitioner. The exception however is the final activity. Usage of data is very much conditional upon the ability to satisfy legislative and policy requirements and navigating usage challenges successfully can be the difference between an unrealised initiative and one that sees the light of day. The level of concern at the highest levels is shown by this recent Forbes article listing 20 contemporary challenges for CEOs—ethical, safe and secure usage of data surfaces in ⅓ of the areas listed.

Tonic.ai: a suite of tools for unlocking corporate data

By anonymising while still preserving the character, structure and distribution of data, the Tonic.ai product suite allows data to be transformed in such a way that it can be used to safely power initiatives that will provide competitive differentiation. The ability to successfully manage the data resources available will in part determine which businesses thrive in the contemporary economy. 

Whether you need structured data de-identification for testing and development, unstructured data synthesis for AI model training, or synthetic data from scratch for new product innovation, we’re here to help. Connect with our team today to equip your developers with the data they need.

Ken Tune
Senior Solutions Architect
Ken Tune is a Senior Solutions Architect at Tonic.ai. He advises major companies across the EMEA region on the unique value Tonic can bring to their business, guiding them from introduction to adoption. Prior to that he held Solution Architect positions at Aerospike and Aiven and was Senior Principal Consultant at MarkLogic, being responsible for guidance and implementation of over 30 separate deployments in total. Additionally, he has a wealth of experience in finance, having worked for Hambros Bank, HSBC and Markit Group with experience including risk management and major system integration. He has a BA in Maths from Cambridge University and an MSc. in Computer Science from Imperial College, London.

Make your sensitive data usable for testing and development.

Unblock data access, turbocharge development, and respect data privacy as a human right.
Accelerate development with high-quality, privacy-respecting synthetic test data from Tonic.ai.Boost development speed and maintain data privacy with Tonic.ai's synthetic data solutions, ensuring secure and efficient test environments.