In a groundbreaking move, Subhodip Pal has spearheaded a transformative initiative to overhaul the critical IT systems of a prestigious university. Mr. Pal’s journey involved replacing an outdated and expensive Oracle Master Data Management (MDM) system with a state-of-the-art architecture based on graph technology, marking a paradigm shift from a traditional RDBMS and monolithic architecture to a modern, cloud-native, scalable framework of containerized microservices.

The university’s previous MDM system was not only financially burdensome, costing more than $1M annually just to keep the lights on, but also lacked the flexibility and scalability to support new enhancements. Mr. Pal’s vision led to a system that proved to be 10 times cheaper in terms of total cost of ownership (TCO), more than 10,000 times faster and almost infinitely scalable. It addresses the university’s immediate needs and can handle backlogged data science and AI use cases without any additional system or infrastructure. The technology stack for the solution includes Neo4j (a leading graph database), Kafka (for near real-time streaming), Docker (for microservice containerization), a cloud platform, Loqate (for address verification), React (for the data steward UI) and Prometheus/Grafana (for advanced monitoring).

One of the key advantages of Mr. Pal’s Neo4j-powered MDM system is its ability to provide near real-time data streaming and data observability capabilities. This ensures that the university’s critical functions, such as student enrollments, operate seamlessly and efficiently. Mr. Pal’s team has successfully executed the last three term enrollments flawlessly, demonstrating the system’s reliability, scalability and performance.

Data observability is a nascent area of data strategy. In 2020, when the system was being designed and built, it was one of the trailblazers in this space, demonstrating how data observability across a microservice architecture can efficiently help identify bottlenecks and points of failure.

Traditional MDM tools lack flexibility and usability from the data steward and operational points of view. Building the data steward interface with React on top of the graph database provides flexibility, usability, portability and support for multiple browsers and mobile devices. React is a free and open-source front-end JavaScript library for building user interfaces based on components; it is maintained by Meta and a community of individual developers and companies.

Neo4j’s full-text indexes, powered by Lucene, provide fast and flexible search capabilities. This capability was exposed as an API to enable scalable, fast data discovery and search matching across the University, including the University’s website. The new platform’s average search-match response time is around 100 to 200 milliseconds, compared with a couple of minutes in the legacy application.
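To illustrate how such a search API might sit on top of a Lucene-backed index, the following is a minimal sketch rather than the university’s actual implementation. It assumes a Neo4j full-text index with the hypothetical name personSearch, and it only builds the Cypher statement and parameters; the function names and the fuzzy-matching choice are assumptions made for illustration.

```python
import re

# Characters with special meaning in Lucene query syntax; each is escaped
# with a backslash so user input is treated as literal text.
_LUCENE_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')

# Neo4j's built-in procedure for querying a full-text (Lucene) index.
SEARCH_CYPHER = (
    "CALL db.index.fulltext.queryNodes($index, $query) "
    "YIELD node, score "
    "RETURN node, score ORDER BY score DESC LIMIT $limit"
)


def build_lucene_query(user_input: str) -> str:
    """Turn free text into a fuzzy Lucene query: every term must match,
    and the trailing '~' lets each term tolerate small spelling differences."""
    terms = [_LUCENE_SPECIAL.sub(r"\\\1", t) for t in user_input.split()]
    return " AND ".join(f"{t}~" for t in terms)


def build_search_request(user_input: str, index: str = "personSearch", limit: int = 10):
    """Return (cypher, parameters). 'personSearch' is a hypothetical index name."""
    params = {"index": index, "query": build_lucene_query(user_input), "limit": limit}
    return SEARCH_CYPHER, params


if __name__ == "__main__":
    cypher, params = build_search_request("Mary O'Brien")
    print(cypher)
    print(params)
    # With the official Neo4j Python driver, the pair could then be passed to
    # session.run(cypher, params) inside an open session.
```

Keeping the query builder separate from the database call means the escaping logic can be checked without a running Neo4j instance.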

The Graph Data Science (GDS) library of Neo4j provides efficiently implemented, fine-tuned, production-grade graph algorithms. Mr. Pal’s vision was to apply these algorithms to cleansed, governed, de-duplicated and survived golden data (curated by business-driven survivorship rules) to drive cutting-edge applications in the higher education industry: creating a knowledge graph around each student’s journey, performing prescriptive and predictive analytics, targeting alumni donations effectively and automating admission ranking and scoring.
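The idea of a survivorship rule can be sketched in a few lines. The example below is a simplified illustration, not the university’s actual rules: the source names, their priority order and the rule itself (the most trusted source wins, the newest record wins within a source, and empty values never survive) are all assumptions.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical source systems, most trusted first (lower number = higher trust).
SOURCE_PRIORITY = {"registrar": 0, "admissions": 1, "alumni_crm": 2}


@dataclass
class SourceRecord:
    source: str    # system the record came from
    updated: date  # last time the record changed in that system
    fields: dict   # attribute name -> value


def build_golden_record(records: list) -> dict:
    """Merge duplicate records for one person into a single golden record."""
    ranked = sorted(
        records,
        key=lambda r: (
            SOURCE_PRIORITY.get(r.source, len(SOURCE_PRIORITY)),  # unknown sources rank last
            -r.updated.toordinal(),                               # newer records first
        ),
    )
    golden = {}
    for record in ranked:
        for name, value in record.fields.items():
            # The first non-empty value seen (in trust order) survives.
            if value not in (None, "") and name not in golden:
                golden[name] = value
    return golden


if __name__ == "__main__":
    duplicates = [
        SourceRecord("alumni_crm", date(2023, 5, 1), {"email": "new@example.edu", "phone": "555-0100"}),
        SourceRecord("registrar", date(2021, 1, 1), {"email": "old@example.edu", "phone": ""}),
    ]
    print(build_golden_record(duplicates))
```

Here the registrar’s email survives because that source is ranked as more trusted, while the phone number is taken from the alumni record because the registrar’s value is empty.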

Pal’s Neo4j-Powered IT Revolution: From DevOps Agility to Gartner Validation

A notable aspect of Mr. Pal’s approach is the incorporation of a DevOps platform (GitLab) with continuous integration and continuous deployment (CI/CD) features, which makes the university’s IT evolution, operations and introduction of new features more agile. This, coupled with the open-source nature of the framework, makes the system nimbler in addressing security concerns, data masking and vulnerabilities. Mr. Pal’s team demonstrated their responsiveness by addressing a recent widespread Log4j-based vulnerability in under 15 minutes, showcasing the robustness and security of the implemented architecture.

During the ideation phase, Mr. Pal’s concept received independent validation from Gartner, a renowned research and advisory company. The Director of Advanced Analytics at the university expressed excitement about the tool’s potential and its alignment with industry trends. According to Gartner, major MDM players are expected to adopt graph-based solutions within the next five years, highlighting the forward-thinking nature of Mr. Pal’s Neo4j-based approach.

The TCO for a Master Data Management initiative is typically around $3M to $15M over a 5-year period, which includes the initial implementation, software licensing fees, professional services fees, infrastructure fees and additional licenses for third-party services such as D&B and address verification. Mr. Pal’s extensive research on deep learning, and on applying string-similarity measures such as Levenshtein distance and Jaro–Winkler distance to higher-education-specific requirements and data, has now enabled all the features of a traditional Master Data Management tool on a modern, cloud-native, data-science-ready platform while drastically reducing the TCO. For reference, the initial implementation along with the Neo4j license cost under $500K, with an additional $80K per year for support (about $400K over five years), bringing the TCO under $1M for a 5-year period.
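For readers unfamiliar with the two measures named above, the following is a minimal, self-contained sketch of Levenshtein distance and Jaro–Winkler similarity as they are commonly defined. It is an illustration only: the helper is_probable_duplicate and its 0.92 threshold are hypothetical and do not reflect the matching rules used in the university’s system.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1], based on matching characters and transpositions."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i, ch in enumerate(s1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if ch != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len(s1) + matches / len(s2) + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for c1, c2 in zip(s1[:4], s2[:4]):
        if c1 != c2:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def is_probable_duplicate(name1: str, name2: str, threshold: float = 0.92) -> bool:
    """Hypothetical matching rule; the threshold is an assumption, not a tuned value."""
    return jaro_winkler(name1.lower(), name2.lower()) >= threshold


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))             # 3
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))   # 0.9611
    print(is_probable_duplicate("Martha", "Marhta"))    # True
```

Levenshtein distance counts edits, so it suits short codes and identifiers, while Jaro–Winkler rewards shared prefixes and is often preferred for personal names.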

The success of Mr. Pal’s initiative extends beyond immediate benefits, as highlighted by the University’s Director of Advanced Analytics, who sees the potential for the graph database to derive affiliation information, power recommendation engines for communication and marketing purposes and serve as the enabler for adopting AI, knowledge graphs and graph data science. The project, endorsed by Gartner, is set to bring new capabilities to the university, ushering in a new era of efficiency and innovation in higher education IT.