Data Archives - 91��

Sustainability Data Analytics Platform for Implementing Sustainability 2.0

91�� — Fri, 13 Jan 2023 13:33:02 +0000

In recent years, several industries have been preparing to embrace sustainability (the management of greenhouse gas emissions, energy consumption, waste, green product development, and water conservation) as an integral factor in their manufacturing. Sustainability is no longer treated as an expense but as a crucial value differentiator. The manufacturing industry is now introducing sustainability 2.0 as an integral part of its business model and core strategy. The aim of launching sustainability 2.0 is to advance long-term sustainability goals and address ESG (environmental, social, and governance) challenges. The need to shift to more sustainable business operations is critical.

91�� is on a journey towards Sustainability 2.0, an agenda that reinvents sustainability under the compelling challenges of climate change and social inequity. This new agenda is driven by a remarkable combination of thought and action on 91��’s part with meaningful public-private-people partnerships. The Sustainability Report 2.0 is available .

The Need for a Sustainability Data Analytics Platform

Research states that 86% of business leaders have invested in sustainable practices to protect their organizations from disruptions. However, reporting on sustainability initiatives is difficult, and the various types of data involved are hard for all relevant parties to find and access. A data analytics platform is therefore vital to driving sustainability performance management.

Why is Sustainability Data Analytics Platform Necessary?

Human errors go undetected during data collection – Leveraging AI-driven document processing methods
Data cannot be traced or audited – Ensuring end-to-end traceability by presenting system logs in a user-friendly form, so that approvers can easily interpret sustainability data reviews and make approval decisions
Aggregating data is time-consuming – Introducing data aggregation capabilities by customizing industry-standard data management tools
Sustainability reports are inconvenient to access – Enabling self-service with role-based access control for different stakeholders
Approval of reported sustainability data is hard to obtain – Leveraging an integrated workflow for end-to-end automation of processes ranging from data preparation to approval
Sustainability targets call for proactive methods – Introducing machine learning models, with the help of predictive analytics tools, to trigger preventive actions that keep the defined sustainability goals within reach
No single source of truth for sustainability reports – Introducing a self-service platform for accessing sustainability artifacts

How Will the Sustainability Data Analytics Platform Help?

Integrated platform with a low-code environment to address the above-mentioned pain points
Single source of truth for stakeholders for complete transparency of sustainability reports
Secured access to data across stakeholders, data providers, reviewers, sustainability officers, and external auditors
Automated workflow for end-to-end data management and reporting
Complete traceability of data preparation, reviews, corrections, and approvals
Complex calculations to arrive at the critical sustainability measures across environmental footprints – energy, emission, water, and waste
Prebuilt data model to reduce the time of implementation
Various connectors to extract data from scanned documents, PDFs, enterprise resource planning systems, and Excel sheets
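To make the verification and calculation steps in the list above a little more concrete, here is a minimal sketch in Python. It is not the platform's actual data model: the record fields, the metric names, and the expected units are all assumptions chosen for illustration. It checks each submitted record for completeness and unit consistency, then totals the valid records per metric and sets the rest aside for a reviewer.

```python
from collections import defaultdict

# Hypothetical record layout and units; these names are assumptions for illustration.
REQUIRED_FIELDS = ("facility", "metric", "period", "value", "unit")
EXPECTED_UNITS = {"energy": "kWh", "emission": "tCO2e", "water": "m3", "waste": "t"}


def validate(record):
    """Return a list of problems found in one submitted record."""
    problems = [f"missing {f}" for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
    if problems:
        return problems
    metric = record["metric"]
    if metric not in EXPECTED_UNITS:
        problems.append(f"unknown metric {metric!r}")
    elif record["unit"] != EXPECTED_UNITS[metric]:
        problems.append(f"unit {record['unit']!r} does not match {EXPECTED_UNITS[metric]!r}")
    if not isinstance(record["value"], (int, float)) or record["value"] < 0:
        problems.append("value must be a non-negative number")
    return problems


def aggregate(records):
    """Sum valid records per metric and collect rejected ones for review."""
    totals = defaultdict(float)
    rejected = []
    for record in records:
        problems = validate(record)
        if problems:
            rejected.append((record, problems))
        else:
            totals[record["metric"]] += record["value"]
    return dict(totals), rejected
```

Keeping the rejected records together with the reasons for rejection, rather than silently dropping them, is what lets a reviewer correct and resubmit them, which is the traceability the list above asks for.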

How is the 91�� Sustainability Data Analytics Platform different from other platforms?

Here are the key differentiators that make 91�� a strong player among others:

Single Version of Truth
Pre-built data model suitable for manufacturing and consumer product companies
Automation of data consistency, accuracy, and completeness verification in the data collection process
Inbuilt predictive insights for corrective actions to meet sustainability targets
Role-Based Access to Data
Accessible roll-out features across businesses
Configuration options for third-party audits
Full-stack solution covering everything from data collection to sustainability reporting and analytics
Easy and convenient to deploy across different business units and facilities
The functional view of each of the components is depicted below:

The 91�� Sustainability Data Analytics Platform is based on cloud technology, favouring the resource-efficient use of IT resources, and enabling every company’s flexible and expandable growth. With integrated solutions, companies can introduce sustainability goals more confidently and cleanly into their daily activities and move strategically toward building more resilient, sustainable businesses.


Data Modernization – What is the best route for your transformation journey? (Part 2)

91�� — Tue, 30 Aug 2022 05:36:46 +0000

So, you have taken the decision to go in for a data modernization exercise, which befits any forward-thinking organization. That’s the good news!

The question now is what is the way forward? What is the most appropriate model for your organization?

The truth is that there is no one-size-fits-all solution. Over the last decade, Data Lakes grew to be the de facto model for modernization. These days, they are being supplanted by, or in many cases have been subsumed into, Data Meshes. Both models have their votaries, and both come with their own set of challenges.

Let us examine these two models in a little more detail so that you can wrap your mind around them more easily and be better positioned to choose between them.

The Data Lake

A Data Lake is a large reservoir into which raw data can be poured and stored until needed. Thanks to its flat architecture, it stores data in its native format, as binary large objects (blobs) or files. It takes in unstructured data, such as emails, documents etc.; binary data like images, audio, and video; semi-structured data, such as CSV, logs, and XML; and structured data from relational databases. The extract-transform-load process happens within the Data Lake itself.

The Data Lake can, therefore, efficiently manage the high Volume, high Variety, and high Velocity of Big Data. It also significantly enhances the value of Big Data by making it available as reports, dashboards, and applications, to facilitate better visualization, advanced analytics, and machine learning. All, of course, to ultimately empower organizations with the ability to take evidence-supported business decisions with more far-reaching impact than ever before.
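As a rough illustration of "pouring" raw data into a flat store in its native format, here is a minimal sketch in Python. It uses a local directory as a stand-in for the lake. The function names, the content-hash keys, and the `_catalog.jsonl` file are assumptions made for this example and are not features of any particular Data Lake product.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def land(lake_root: Path, source: str, name: str, payload: bytes) -> dict:
    """Store the payload unchanged and append a catalog entry describing it."""
    digest = hashlib.sha256(payload).hexdigest()
    # Flat layout: one object per file, keyed by content hash, no folder hierarchy.
    (lake_root / digest).write_bytes(payload)
    entry = {"key": digest, "source": source, "name": name, "bytes": len(payload)}
    with (lake_root / "_catalog.jsonl").open("a", encoding="utf-8") as catalog:
        catalog.write(json.dumps(entry) + "\n")
    return entry


def read_catalog(lake_root: Path) -> list:
    """Return every catalog entry, oldest first."""
    with (lake_root / "_catalog.jsonl").open(encoding="utf-8") as catalog:
        return [json.loads(line) for line in catalog]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        land(lake, "crm", "customers.csv", b"id,name\n1,Ada\n")
        land(lake, "mail", "note.eml", b"Subject: hello\n\nbody")
        for entry in read_catalog(lake):
            print(entry["source"], entry["name"], entry["bytes"])
```

Nothing is parsed or transformed on the way in; the CSV and the email are stored byte for byte. The small catalog entry written alongside each object is the only record of where the data came from.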

Being a single, integrated, and complete system, the Data Lake also facilitates faster and simpler development of applications, which are built on a single codebase.

The Data Lake can reside in the cloud, on a platform such as Microsoft Azure, or on a distributed file system such as the Hadoop Distributed File System (HDFS), for example in combination with MS SQL Server.

However, the Data Lake also has its drawbacks.

As the volume of data increases and grows more complex, the central IT function becomes overloaded with requests and cannot keep pace. Individual project teams then try to bypass it and deploy quick fixes that are poorly integrated and create problems in the future.

What is worse, organizations keep pouring data into the Lake and eventually lose track of what it contains. Much valuable information can go unnoticed because data analysts have no knowledge vis-à-vis the data’s source domain and engage in fishing expeditions.

Many organizations have seen their Data Lakes turn into data swamps because, after a point, it entails considerable technical and organizational effort to make productive use of them.

The Data Mesh

The Data Mesh evolved in response to the many challenges that the Data Lakes posed.

Unlike the Data Lake, the Data Mesh is a composite ecosystem, not a monolith. It breaks giant, monolithic enterprise data architectures into decentralized subsystems, each owned and managed by a dedicated team.

The Data Mesh facilitates the management, connection, and smooth flow of data from producers through to consumers, whether outside or within a Data Lake. In that sense, a Data Mesh may include Data Lakes.

Data Meshes can be said to have four pillars:

Decentralized Data Ownership

Data is owned by the entity that produces it, typically functions such as HR, Finance, Marketing, etc. Therefore, more value can be derived from it. Typically, tools such as Azure Databricks are used to process the large workloads of data.

Data as Product

Users, such as data analysts, can easily source data directly from the domain owners, who will ensure that the data is of high quality. Conflicts are eliminated by using approaches like event sourcing and CQRS.

Self-serve data infrastructure as a platform

Domain teams can create, transform, and consume data products autonomously.

Federated governance

Mandated universal standards to enable smooth interoperability and flow of data.
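The four pillars above can be sketched as a small Python model. This is only an illustration under stated assumptions: the `DataProduct` fields, the `REQUIRED_METADATA` rules, and the `Mesh` registry are hypothetical names invented here, not part of any Data Mesh product or of the original Data Mesh writings.

```python
from dataclasses import dataclass, field

# Hypothetical organization-wide rules; a real mesh would define its own standards.
REQUIRED_METADATA = ("owner", "schema", "freshness_sla_hours")


@dataclass
class DataProduct:
    """Data as a product: published and owned by the domain that produces it."""
    name: str
    domain: str
    owner: str = ""
    schema: dict = field(default_factory=dict)
    freshness_sla_hours: int = 0


def governance_violations(product: DataProduct) -> list:
    """Federated governance: check one product against the shared standards."""
    return [
        f"{product.name}: {attr} not set"
        for attr in REQUIRED_METADATA
        if not getattr(product, attr)
    ]


class Mesh:
    """Self-serve registry through which consumers discover domain products."""

    def __init__(self):
        self._products = {}

    def publish(self, product: DataProduct) -> None:
        violations = governance_violations(product)
        if violations:
            raise ValueError("; ".join(violations))
        self._products[(product.domain, product.name)] = product

    def find(self, domain: str, name: str) -> DataProduct:
        return self._products[(domain, name)]
```

In this sketch, each domain team calls `publish` on its own without going through a central IT function, while the shared `governance_violations` check is the one thing every domain has to satisfy.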

The Data Mesh brings many benefits to the table

Flexibility and Choice – Since its architecture is domain-driven and distributed, you have the flexibility to choose vendors and technologies that work best for you, without getting locked onto one platform.

Greater agility, seamless collaboration, shorter project times – Since domain teams own their data, they can operate independently, making them more agile and responsive. At the same time, since the teams are cross-functional, collaboration becomes simpler and more efficient. Development accelerates and projects go live faster!

Superior quality – Since ownership is vested with domain experts, the quality of the data is always high. Further, by mandating universal protocols and principles, the Data Mesh promotes the delivery of data in standardized formats for easier access.

Quick service – Data producers and data users interact based on pre-determined SLAs, which enables much faster delivery of data. All data management needs such as storage, logging, and identity management, which slow the process down, are handled by the Data Mesh’s inbuilt capabilities.

Scalability – Being distributed in structure, the Data Mesh is also eminently scalable with minimal disruption.

So, should your company upgrade to a Data Mesh?

A Data Mesh certainly sounds like a panacea for all data ills but, like all technology solutions, it must be opted for after due thought and diligence. Keeping the following factors in mind will help you make a better-informed decision about whether your organization needs to upgrade to a data mesh.

Duplication of data: Repurposing data to serve another domain’s needs may lead to data duplication. This can lead to higher storage requirements as well as increased data management costs.

Quality Avoidance: The availability of multiple data products and pipelines may lead to non-compliance with governance standards. Therefore, these principles will need to be clearly articulated and compliance enforced through appropriate measures at the domain level.

Change management efforts: Deploying data mesh architecture and decentralized data operations will entail organization-wide change management efforts. You will need to plan to allow for business disruptions and to ensure that critical operations continue.

Choosing future-proof technologies: Teams will have to think long term when selecting technologies that will be standardized across the company, to ensure easier future upgrades with minimal disruption.

Cross-domain analytics: Reporting becomes decentralized as well, and a separate organization-wide model may need to be defined to consolidate diverse data products into one report.

Talk to us at 91��. We’ll undertake an assessment of your existing digital landscape, identify modernization areas, build a strategic roadmap, and define the enterprise architecture you need.

Click here for Part 1 of this blog: Modernize the Data Ecosystem to Lay the Foundation of an Insights-driven Digital Next Enterprise (Part 1)

Reference:

Zhamak Dehghani
Data Mesh Founder

Author:

Bhagaban Khatai
Data Transformation Leader


Modernize the Data Ecosystem to Lay the Foundation of an Insights-driven Digital Next Enterprise (Part 1)

91�� — Tue, 28 Jun 2022 10:43:56 +0000

Data modernization has become an urgent competitive necessity for businesses to stay ahead of the curve – anticipate market changes earlier, understand customer needs more closely, and take and implement winning decisions faster than the competition.

That said, technology leaders need to assess the pros and cons of a modernization exercise. Businesses must study the various avenues for modernization and choose the one that gives them the best cost-benefit balance. As with any change management initiative, it is disruptive and entails focused deployment of resources.

In this article, I will discuss three frameworks/platforms that we at 91�� have helped our clients use to leverage data effectively for business success.

The Data Warehouse

The Data Warehouse was probably the first enterprise-level platform to use data for business decision support. It came into its own in the 1990s and at the turn of the millennium. As its name implies, it organized data in structured and labelled fields that could be easily accessed, and it worked excellently.

Data-driven business intelligence, as a concept, gained massive leverage thanks to the Data Warehouse. However, like its counterpart in the real world, the Data Warehouse’s key drawback is poor scalability. It works on a pre-built schema and can take in only structured data. As a result, the data is siloed and not all data is captured.

As the three Vs of data – volume, variety, and velocity – grow, as in today’s age of Big Data, the Data Warehouse becomes unwieldy and inefficient. And data’s fourth V, veracity, suffers in consequence.

This is not to say that the Data Warehouse has outlived its utility. It still works efficiently for businesses that deal with a smaller volume and variety of data and provides excellent decision support intelligence at a relatively lower investment.
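The "pre-built schema, structured data only" limitation described above can be illustrated with a few lines of Python. This is a toy loader, not how any specific warehouse product works; the `SALES_SCHEMA` columns and the `load` function are made up for the example.

```python
# Hypothetical pre-built schema: column name -> required Python type.
SALES_SCHEMA = {"order_id": int, "region": str, "amount": float}


def load(rows, schema=SALES_SCHEMA):
    """Accept only rows that match the schema exactly; set everything else aside."""
    accepted, rejected = [], []
    for row in rows:
        fits = (
            isinstance(row, dict)
            and row.keys() == schema.keys()
            and all(isinstance(row[col], typ) for col, typ in schema.items())
        )
        (accepted if fits else rejected).append(row)
    return accepted, rejected
```

A row that matches the three columns is loaded. A row with an extra field, or a raw email body, lands in `rejected`, which is one way "not all data is captured".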

The Data Lake

The Data Warehouse’s inherent problems gave rise to the Data Lake, a platform with no hierarchical structure that is more attuned to the needs of Big Data.

A data lake is like a reservoir into which raw data can be poured and stored until needed. It has a flat architecture and takes in data in its native formats – emails, documents, images, audio, video, semi-structured data, such as CSV, logs, and XML, as well as structured data from relational databases.

The extract-transform-load process happens within the Lake itself and data is presented as reports, dashboards, and such, to facilitate better visualization and more accurate analytics, as well as to enable machine learning.

The Data Lake is thus capable of managing the high Volume, high Variety, and high Velocity of Big Data.

However, the Data Lake also has its drawbacks.

Once data is put into the Lake, it becomes monolithic. This limits the knowledge that data analysts can gain from it and increases the risk of valuable information going unnoticed.

Its centralized control structure stretches the IT team thin. Projects get delayed, forcing teams to resort to poorly integrated ‘quick-fix’ solutions that eventually compound problems.

Consequently, it often ends up as a huge unmanageable data dump yard. Drawing any useful sense out of the Data Lake becomes a complex, expensive, and resource-intensive task.

It is in response to these problems that the concept of a Data Mesh came into being.

The Data Mesh

Unlike the Data Lake, the Data Mesh is a composite, integrated ecosystem, and not a monolith. It is composed of decentralized subsystems or domains, each managed by a dedicated team. In a sense, you can say that the Data Mesh as a whole is greater than the sum of its parts.

It thus offers several advantages over the Data Lake.

It makes domain experts owners of their data. Thus, there is no danger of valuable nuggets of information being lost or ignored.

It treats data as a product and enables a smooth and secure flow of data from producers to users, whether outside or within a Data Lake. In that sense, a Data Mesh may include Data Lakes.

It encourages cross-functional teams and empowers them to operate independently, with little or no support from a central IT function. Collaboration is more efficient, the pace of development accelerates, and projects go live much sooner.

Its decentralized approach gives you the flexibility to choose vendors and technologies that work best for you, without getting locked onto one platform.

A Data Mesh can be deployed for a broad range of needs and for diverse use cases:

Migrating applications to the cloud
Modernizing data lakes to make data more easily accessible
Integrating apps, IoT, and analytics in real-time
Streaming data pipelines within or from data lakes
Data-in-motion analytics

So which modernization solution is best for your organization?

We have been part of the Data & Analytics transformation journey over the last 15 years.

Click here for Part 2 of this blog: Data Modernization – What is the best route for your transformation journey? (Part 2)

Author:

Bhagaban Khatai
Data Transformation Leader

Reference:

Zhamak Dehghani
Data Mesh Founder


Keeping your data protected from ransomware attack in the new era

91�� — Mon, 03 Jan 2022 07:11:34 +0000

Ransomware was the top threat type, comprising 23% of attacks. In 2019, the U.S. was hit by an unprecedented and unrelenting barrage of ransomware attacks that impacted at least 966 government agencies, educational establishments and healthcare providers at a potential cost in excess of $7.5 billion. The average cost of a data breach increased significantly, from $3.86 million in 2020 to $4.24 million in 2021. Ransomware attacks cost an average of $4.62 million, more than the average data breach ($4.24 million). Malicious attacks that destroyed data in wiper-style attacks cost an average of $4.69 million.

The share of organizations deciding to pay a ransom rose to 32% in 2021, compared to 26% in 2020. Even after paying the ransom, only 8% of them got all their data back, and nearly a third (29%) could not recover more than half of the encrypted data. On average, only 65% of the encrypted data was restored after the ransom was paid; put differently, 92% of those who pay do not get all of their data back. Approximately 37% of global organizations (more than one third) said they were the victim of some form of ransomware attack in 2021, according to IDC.

Confidentiality, Integrity, and Availability are the three pillars of security. Integrity is an important dimension: it means that data has not been altered in an unauthorized manner while “at rest, being processed, or in transit”. Here we focus only on “at rest” data in relation to ransomware. While a high level of effort is required to prevent “attackers from getting in” or “escalating their privileges within the system”, the best bet for an organization remains to “protect its critical data from unauthorized access and destruction”.
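One simple way to notice that data at rest has been altered is to compare it against cryptographic digests recorded earlier. The Python sketch below shows the idea with SHA-256. The function names and the in-memory `manifest` are assumptions for illustration, and the check is only meaningful if the manifest itself is kept somewhere an attacker cannot write to.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """SHA-256 digest of the content, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def altered(manifest: dict, current: dict) -> list:
    """Names whose current content is missing or no longer matches the recorded digest."""
    return sorted(
        name
        for name, digest in manifest.items()
        if name not in current or fingerprint(current[name]) != digest
    )
```

For example, a manifest is recorded when the backup is taken and `altered` is run before any restore. A file that ransomware has encrypted in place will no longer match its recorded digest.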

Ransomware attacks focus on encrypting any data to which they can get write access, including the backup system. This may also happen because poorly implemented permissions expose backup data, wherever it is stored. It makes a ransomware attack more effective because organizations cannot recover data from their backup systems.

The big step towards protecting data is to keep an isolated, immutable backup that is not accessed in general use and is subject to very strict administrative authentication and authorization, limited to a small set of admins. There was a time when data backups on physical tapes were kept off-site, as part of a BCP/DR approach, to protect them from physical damage to the data center. That was one of the best ways to ensure data integrity. We should leverage cloud offerings that are equally effective at protecting data from ransomware attacks.

Now let’s discuss immutable backup methods, which are the key requirement here. Azure Blob storage offers immutable storage, which enables users to store business-critical data in a WORM (Write Once, Read Many) state for a defined time interval. While in a WORM state, data objects can be created and read, but cannot be modified or deleted for a user-specified interval. By configuring immutability policies for blob data, customers can protect their data from overwriting and deletion. Another benefit of Azure Blob storage is the legal hold, which keeps data immutable until the hold is explicitly cleared. When a legal hold is set, objects can be created and read, but not modified or deleted. It’s important to understand how immutability is implemented and whether it is truly WORM, even if OS administration accounts are compromised.
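The WORM behaviour described above (create and read allowed, modify and delete blocked during a retention interval or while a legal hold is set) can be modelled in a few lines of Python. This is a toy in-memory model to make the semantics concrete. It is not the Azure SDK, and the `WormStore` class and its method names are invented for this sketch.

```python
from datetime import datetime, timedelta, timezone


class WormStore:
    """Toy in-memory model of write-once, read-many semantics (not a cloud SDK)."""

    def __init__(self, retention: timedelta):
        self._retention = retention
        self._objects = {}          # key -> (data, retain_until)
        self._legal_holds = set()   # keys currently under legal hold

    def create(self, key: str, data: bytes, now: datetime) -> None:
        # Write once: an existing object can never be overwritten.
        if key in self._objects:
            raise PermissionError(f"{key} already exists and cannot be overwritten")
        self._objects[key] = (data, now + self._retention)

    def read(self, key: str) -> bytes:
        return self._objects[key][0]

    def set_legal_hold(self, key: str, on: bool) -> None:
        if on:
            self._legal_holds.add(key)
        else:
            self._legal_holds.discard(key)

    def delete(self, key: str, now: datetime) -> None:
        _, retain_until = self._objects[key]
        if key in self._legal_holds:
            raise PermissionError(f"{key} is under legal hold")
        if now < retain_until:
            raise PermissionError(f"{key} is retained until {retain_until.isoformat()}")
        del self._objects[key]
```

The time-based retention and the legal hold are checked independently: an object whose retention interval has expired still cannot be deleted while a hold is set, which matches the description of the legal hold as lasting until it is explicitly cleared.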

Those on the AWS platform can use AWS Backup Vault Lock to prevent any user from deleting backups or changing backup lifecycle settings, whether by accident or through malicious action. AWS Backup Vault Lock (S3 Glacier) improves a customer’s security posture and ensures a mechanism for restore, even in a worst-case scenario such as total account compromise. Another service that’s useful for data protection is the AWS object storage service S3, where you can use features such as object versioning to help prevent objects from being overwritten with ransomware-encrypted files, or Object Lock (S3), which provides a write once, read many (WORM) solution to help prevent objects from being modified or overwritten.

You can use Compliance retention mode if you never want any user, including the root user in your AWS account, to be able to delete the objects during a pre-defined retention period. You can use a Legal Hold as an indefinite retention period: once applied, the object cannot be deleted until the hold is released manually (only by users with special permissions). Every backup within the retention period is an immutable backup with point-in-time restore capabilities. In addition, enabling MFA Delete on an S3 bucket safeguards against permanently deleting an object version or changing the versioning state of the bucket.
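As a hedged sketch of how the Compliance mode and Legal Hold settings above might be applied when uploading a backup, the Python helper below builds the Object Lock arguments that boto3's S3 `put_object` call accepts (`ObjectLockMode`, `ObjectLockRetainUntilDate`, `ObjectLockLegalHoldStatus`). The helper name `object_lock_args`, the bucket name, and the key are placeholders invented for this example. The upload itself is shown only as a comment because it needs AWS credentials and a bucket created with Object Lock enabled.

```python
from datetime import datetime, timedelta, timezone


def object_lock_args(retention_days: int, legal_hold: bool = False, now=None) -> dict:
    """Build Object Lock keyword arguments for an S3 put_object call."""
    now = now or datetime.now(timezone.utc)
    args = {
        # Compliance mode: no user can shorten or remove the retention.
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": now + timedelta(days=retention_days),
    }
    if legal_hold:
        # A legal hold has no end date; it lasts until it is switched off.
        args["ObjectLockLegalHoldStatus"] = "ON"
    return args


# Usage with boto3 (not executed here; names are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   s3.put_object(Bucket="example-backups", Key="db/backup.dump",
#                 Body=data, **object_lock_args(retention_days=30))
```

Depending on the SDK version, S3 may also require a Content-MD5 or checksum header on uploads that set a retention period, so treat the commented call as a starting point and not a complete recipe.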

Similarly, GCP storage buckets with Bucket Lock offer write-once (WORM), immutable storage to meet your compliance standards and ensure your data’s integrity while offering instantaneous access for quick restores. As part of this protection, once you lock a bucket’s retention policy, the lock is permanent: the policy cannot be removed or shortened, and the bucket itself can only be deleted once every object in it is out of the retention period. Retention policies prevent the deletion or modification of the bucket’s objects. Applying Bucket Lock to a storage bucket in the Archive class can help you achieve WORM compliance for long-term data archival as well.

Apart from immutable backup options, which provide secure storage for your data, there are well-known basic steps: keep multiple copies of backup data (including off-site copies), use Multi-Factor Authentication (MFA) as standard practice for administrative accounts, and separate administrative roles. We also need to encrypt the data and segment the workflow so that only authorized systems and users have access, and limited access at that, to the key material that decrypts the data. Network sharing protocols work well for general-purpose file sharing. However, minor mistakes in permissions can lead to data being exposed. Instead of using them, we recommend using object storage APIs (for example, Amazon S3-compatible APIs) or virtual tape libraries, or keeping storage “local” to the backup server (not accessed over a network sharing protocol).

The interesting part is that a technology initially introduced for security compliance, to keep a golden copy of data for later auditing or reconciliation, is now also used to safeguard data integrity against ransomware attacks. Cloud storage is an economical solution because its resources are readily available, scalable, and multi-tiered.

Author:

Deepak Kumar,
Cloud Practice Head,
91��

