Data Lake - Ruihe Data Technology Holdings Limited

Platform Function

Provide a highly reliable, high-performance, scalable distributed storage system and large-scale data processing capabilities

Built on the Hadoop framework, the platform can reduce the unit cost of storage to a certain extent while holding massive amounts of structured, semi-structured, and unstructured data in one place

Provide a rich set of data computation and analysis engines

Supports multi-level fused analysis of structured, semi-structured, and unstructured data through a range of computing engines, including batch processing, stream computing, interactive analysis, and machine learning

Provide complete data management capabilities

Manages all kinds of data-related elements, including data sources, data formats, connection information, data schemas, and permissions. The platform stores not only the original data but also the intermediate results of analysis and processing, and it keeps a complete record of how the data was analyzed and processed, so users can trace in detail how any piece of data was produced
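The tracing idea can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `LineageLog` that records, for each derived dataset, the operation and the inputs that produced it. The class, method, and dataset names are illustrative only and are not part of the platform.

```python
from dataclasses import dataclass, field


@dataclass
class LineageLog:
    """Hypothetical lineage log: maps each derived dataset to the step that produced it."""

    # dataset name -> (operation description, list of input dataset names)
    produced_by: dict = field(default_factory=dict)

    def record(self, output, operation, inputs):
        """Remember which operation and which inputs produced `output`."""
        self.produced_by[output] = (operation, list(inputs))

    def trace(self, dataset):
        """Return the steps that led to `dataset`, nearest step first."""
        steps = []
        seen = set()
        queue = [dataset]
        while queue:
            current = queue.pop(0)
            if current in seen or current not in self.produced_by:
                continue  # already visited, or raw source data with no recorded step
            seen.add(current)
            operation, inputs = self.produced_by[current]
            steps.append((current, operation, inputs))
            queue.extend(inputs)
        return steps


log = LineageLog()
log.record("orders_clean", "drop rows with null order_id", ["orders_raw"])
log.record("daily_revenue", "group by day and sum amount", ["orders_clean", "fx_rates_raw"])

history = log.trace("daily_revenue")
for name, operation, inputs in history:
    print(f"{name} <- {operation} <- {inputs}")
```

Storing one record per processing step is enough to walk back from any result to the raw sources it came from.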

Key capabilities include

Hybrid Processing

Supports ingesting all types of data into the lake without a pre-designed model, and supports both transactional and analytical data processing. Once in the lake, data is available for ad hoc analysis and continuous iteration

Federated Analysis

Supports fused analysis across multiple data formats without relocating the data. Cross-source exploration, computation, and analysis can be carried out through standard query statements
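As a small-scale illustration of querying across sources with a standard query statement, the sketch below uses Python's built-in `sqlite3` module and SQLite's `ATTACH DATABASE` as a stand-in for a federated query engine. The two in-memory databases, the table names, and the sample rows are assumptions made for the example; the platform's own engine and connectors are not shown here.

```python
import sqlite3

# Two separate SQLite databases stand in for two independent data sources.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")

# Hypothetical tables: orders live in one source, customers in the other.
conn.execute("CREATE TABLE main.orders (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO main.orders VALUES (?, ?)",
    [(1, 120.0), (1, 30.0), (2, 75.0)],
)
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)

# One standard SQL statement joins both sources; no data is copied between them.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM main.orders AS o
    JOIN crm.customers AS c ON c.id = o.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
conn.close()

print(rows)
```

The point of the sketch is the shape of the query: each source keeps its own storage, and the join is expressed once in ordinary SQL.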

Elastic Scaling

The compute and storage layers scale independently and elastically. Large-capacity storage pools and "theoretically" unlimited elastic compute resources allow a quick response to changes in data and business needs

Hierarchical Storage

Automatically manages tiered storage of hot and cold data, making rational use of storage and reducing cost
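One simple way to picture a hot/cold placement rule is to decide by how recently a dataset was accessed. The sketch below assumes a hypothetical 30-day threshold and made-up dataset paths; the rules the platform actually applies are not described in this page.

```python
from datetime import datetime, timedelta

# Hypothetical policy: data untouched for more than 30 days is treated as cold.
HOT_WINDOW = timedelta(days=30)


def choose_tier(last_accessed, now):
    """Return "hot" for recently accessed data and "cold" otherwise."""
    return "hot" if now - last_accessed <= HOT_WINDOW else "cold"


now = datetime(2024, 1, 31)
# Illustrative catalog: dataset path -> last access time.
catalog = {
    "clickstream/2024-01-30": datetime(2024, 1, 30),
    "archive/2021-audit": datetime(2021, 6, 1),
}

placement = {path: choose_tier(ts, now) for path, ts in catalog.items()}
print(placement)
```

A scheduled job could evaluate a rule like this periodically and move cold data to cheaper storage.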

Data Exploration

Integrated algorithm development capabilities make it possible to build algorithm models and explore data quickly, and even to combine them with standard database query statements. Standard interfaces are supported for algorithm and AI business development

Data Release

Platform Advantage

item.01

More Intuitive Data Value

Even before data applications are commercialized, flexible but controllable data sharing tools and platforms accelerate the exchange of data inside and outside the lake and inside and outside the organization. Integrating and connecting that data forms a more complete data panorama that serves the business.

Incorporates data commercialization and social operation tools, such as data sandboxes, intelligent data masking, self-service subscriptions, and usage statistics, to unlock the value of the data assets themselves

item.02

More Flexible Data Analysis

Incorporates the federated learning capability of "data stays in place, computation moves" to address data migration, data security, and data rights and responsibilities. Includes a hybrid transactional/analytical processing architecture, in which the same data serves both transactional and analytical workloads, to address the timeliness and consistency issues caused by importing databases into a data warehouse. Includes ad hoc multi-dimensional analysis over "large wide tables", which avoids the long pipelines and low timeliness of traditional multi-dimensional analysis, where data must be split and converted by subject in advance

item.03

More Refined Asset Management

Data can be classified and stored in tiers from different perspectives, such as hot versus cold data and business tags. With predefined data management and control rules and log-based machine learning for operation and maintenance tasks, data management can be semi-automatic or even fully automatic, making rational use of system resources to achieve "data autonomy"

item.04

Smarter Data Access

In the era of big data, information continues to explode. The volume, variety, and complexity of data are all growing rapidly. The data lake can become the integration hub for all of this data.

Data perception technology considers the type of data being accessed, its update frequency, its size, and the preset usage scenarios to intelligently choose the data access method and automatically match the underlying protocols and technologies. This lowers the barrier to using the data lake and reduces overall operation and maintenance costs
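A rule-based selector gives a rough picture of how an access method might be chosen from a source's characteristics. Everything in the sketch below is an assumption made for illustration: the `SourceProfile` fields, the thresholds, and the method names are hypothetical and do not describe the platform's actual logic.

```python
from dataclasses import dataclass


@dataclass
class SourceProfile:
    """Hypothetical description of a data source to be connected to the lake."""

    data_type: str           # "structured", "semi-structured", or "unstructured"
    updates_per_hour: float  # how often the source changes
    size_gb: float           # approximate total size


def choose_access_method(profile):
    """Pick an ingestion method from simple, illustrative rules."""
    if profile.updates_per_hour >= 60:
        return "streaming"          # frequent changes: ingest continuously
    if profile.data_type == "unstructured":
        return "file_transfer"      # documents, images, and similar files
    if profile.size_gb >= 100:
        return "bulk_batch"         # large, slowly changing tables
    return "incremental_batch"      # small, slowly changing tables


sensor_feed = SourceProfile("semi-structured", updates_per_hour=3600, size_gb=5)
scanned_contracts = SourceProfile("unstructured", updates_per_hour=0.1, size_gb=500)
ledger = SourceProfile("structured", updates_per_hour=1, size_gb=500)

methods = [choose_access_method(p) for p in (sensor_feed, scanned_contracts, ledger)]
print(methods)
```

In practice such rules would likely be combined with the preset usage scenarios mentioned above; the sketch only covers type, update frequency, and size.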

Platform Value

Deeply mine the value of data to help enterprises implement digital transformation

Manage data catalogs, models, standards, accountability, security, visualization, and sharing; centralize data storage, processing, classification, and management; automate report generation, make data analysis agile, and make data mining visual; and put data quality assessment and management processes into practice

Meet the needs of data analysis applications at all levels of the enterprise

Use the data lake's intelligent analysis, data visualization, and related technologies to enable data sharing, automatic daily report generation, and fast, intelligent analysis for data analysis applications at every level of the enterprise
