This year’s Strata Data Conference, held in San Francisco during the last week of March, marked the ninth consecutive year that several thousand data and analytics professionals gathered in the Bay Area for presentations, tutorials, networking, and discussions with new and established vendor exhibitors. The main themes this year centered on data engineering, deep learning, and machine learning in the context of how to Make Data Work. The keynotes provided inspiration and imagination for what the world is doing, and can do, because of the AI revolution: speakers shared examples of applied AI, advanced cryptography techniques combined with AI, and how AI is used behind the scenes at EA for video games, to name a few. For all the excitement and opportunity, speakers also acknowledged the challenges that come with AI. We listened to authors discuss AI’s influence on cyber-attacks, the risks of black-box algorithms, and odd anecdotes about the difficulties of using AI to design chatbots. Still, attendees were clearly interested in how to solve the challenges data professionals face, and with thousands of opportunities for infusing AI into everyday life, the need for data engineering and machine learning will continue to grow at a rapid pace.
Ben Lorica, O’Reilly Media chief data scientist and conference chair, shared details of the recent O’Reilly survey, “Evolving Data Infrastructure,” which explored the technologies respondents are using and what they are building. The results suggest that ML will grow even more quickly over the next five years, and that as 5G becomes widespread we’ll see more machine-to-machine applications. A new role, the “machine learning engineer,” is cropping up to sit between data scientists and engineers and bring models to production. Of course, architecture, integration and data unification, tools, and governance remain prime focus areas for supporting and sustaining analytics efforts.
Cloud architecture continues to be top of mind for supporting data and analytics needs, and it has matured and evolved in several ways. Sessions and conversations focused on the essence of “cloud-native” architecture: separating compute from storage, with elasticity in both, in managed services for solutions and platforms that take advantage of this architecture pattern (rather than simply running applications in virtual machines in the cloud). Once this pattern is recognized as a common denominator, it reframes another dominant conversation: how to architect hybrid-cloud and multi-cloud technologies. These discussions have grown out of the experience of early cloud adopters, whose maturation has led to the reality that hybrid and multi-cloud implementations are what enterprises need. Similarly, operationalizing these hybrid environments and enabling application fluidity continue to drive awareness and adoption of Kubernetes and Lambda implementations.
Complexity drives the need for abstraction in management and usability as technologies crowd the enterprise and move from emerging to operational. Vendors and data professionals alike recognize the need to simplify architecture for enterprise adoption of these technologies and approaches. We saw this theme at the forefront with several vendors, including MinIO, which standardizes object storage across environments and for AI, and Cloudera, which is unifying its data engines and merging in the Hortonworks DataPlane Service for hybrid data management.
Hybrid cloud architectures reveal the need for a balanced equation between cloud services and their on-premises counterparts. Modernizing data and analytics platforms in the enterprise must also include modernizing on-premises architecture rather than solely migrating to the cloud. Open-source technologies such as Spark and Kafka are recognized for their ability to be deployed on-premises as well as in public clouds, providing architectural portability that helps address concerns about lock-in to cloud providers’ proprietary services. Leading vendors in this space, like event sponsor Cloudera, are clear that their data platforms are intended for hybrid and multi-cloud fluidity.
Data democratization is becoming much less of a myth or holy grail, as several sessions explored companies’ journeys toward enterprise adoption. In addition to scalable, intuitive tools, innovative techniques and approaches are helping to overcome the ongoing struggle to enable analysts and analytically minded businesspeople and to connect more people with data. Presenters recommended building a “decision culture” from the bottom up to establish and sustain a user-centric, self-service mindset. Enterprise data lakes continue to thrive as modern analyst tools with data catalog and data governance features empower business analysts across a range of skill levels.
We expect these themes to continue to play out in the industry, and we look forward to the spread of innovative ideas, approaches, and technologies at the next Strata Data Conference in the United States, scheduled for September 23-26, 2019, in New York City.