Azure Data Lake Storage
E185662
Azure Data Lake Storage is a scalable, secure cloud-based data lake service from Microsoft designed for big data analytics and enterprise data warehousing workloads.
All labels observed (4)
| Label | Occurrences |
|---|---|
| Azure Data Lake Storage (canonical) | 7 |
| Azure Data Lake | 1 |
| Azure Data Lake Storage Gen2 | 1 |
| Azure Data Lake Storage Gen2 (managed by Power BI) | 1 |
How this entity was disambiguated
This entity first appeared as the object of triple T1647668 — resolving that mention is where its identity was fixed. The disambiguator weighed these candidate entities and picked the highlighted one (or “None”, minting a new entity). This is how homonymy is resolved: the same surface form can point to different entities.
Target entity: Azure Data Lake Storage
Context triple: [Azure Synapse Analytics, integratesWith, Azure Data Lake Storage]
- A. Azure Blob Storage: Azure Blob Storage is a cloud-based object storage service for storing and managing large amounts of unstructured data such as text and binary files.
- B. Azure Synapse Analytics: Azure Synapse Analytics is a cloud-based analytics service from Microsoft that unifies big data and data warehousing to enable large-scale data integration, exploration, and business intelligence.
- C. Snowflake Data Cloud: Snowflake Data Cloud is a cloud-native data platform that enables organizations to store, integrate, and analyze data at scale across multiple clouds with a unified, fully managed service.
- D. Azure: Azure is Microsoft's cloud computing platform offering a wide range of services for building, deploying, and managing applications and infrastructure through Microsoft-managed data centers.
- E. Azure SQL Database: Azure SQL Database is Microsoft’s fully managed, cloud-based relational database service built on SQL Server technology for scalable, secure data storage and processing.
- F. None of above. (chosen)
- G. Unsure - the case is ambiguous/there is not enough information to decide.
Target entity: Azure Data Lake Storage
Target entity description: Azure Data Lake Storage is a scalable, secure cloud-based data lake service from Microsoft designed for big data analytics and enterprise data warehousing workloads.
- A. Azure Blob Storage: Azure Blob Storage is a cloud-based object storage service for storing and managing large amounts of unstructured data such as text and binary files.
- B. Azure Synapse Analytics: Azure Synapse Analytics is a cloud-based analytics service from Microsoft that unifies big data and data warehousing to enable large-scale data integration, exploration, and business intelligence.
- C. Snowflake Data Cloud: Snowflake Data Cloud is a cloud-native data platform that enables organizations to store, integrate, and analyze data at scale across multiple clouds with a unified, fully managed service.
- D. Azure: Azure is Microsoft's cloud computing platform offering a wide range of services for building, deploying, and managing applications and infrastructure through Microsoft-managed data centers.
- E. Azure SQL Database: Azure SQL Database is Microsoft’s fully managed, cloud-based relational database service built on SQL Server technology for scalable, secure data storage and processing.
- F. None of above. (chosen)
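The selection step described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline's actual code: `Candidate`, `resolve`, and the two letter constants are hypothetical names, and the letters simply mirror the options listed on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    letter: str  # option letter as shown in the candidate list
    label: str   # canonical label of an existing entity


# Hypothetical constants mirroring the options on this page.
NONE_OF_ABOVE = "F"
UNSURE = "G"


def resolve(mention: str, candidates: list[Candidate], choice: str) -> tuple[str, bool]:
    """Return (entity label, minted), where minted is True for a new entity."""
    if choice == NONE_OF_ABOVE:
        # No candidate matches: mint a new entity named after the mention.
        return mention, True
    if choice == UNSURE:
        raise ValueError("ambiguous mention: not enough information to decide")
    for candidate in candidates:
        if candidate.letter == choice:
            # The mention is linked to an entity that already exists.
            return candidate.label, False
    raise ValueError(f"unknown option: {choice!r}")


candidates = [
    Candidate("A", "Azure Blob Storage"),
    Candidate("B", "Azure Synapse Analytics"),
    Candidate("C", "Snowflake Data Cloud"),
    Candidate("D", "Azure"),
    Candidate("E", "Azure SQL Database"),
]

label, minted = resolve("Azure Data Lake Storage", candidates, "F")
print(label, minted)
```

With the choice recorded above ("F"), the mention is not merged into any of the five candidates and a new entity is created for it.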
Statements (67)
| Predicate | Object |
|---|---|
| instanceOf | Microsoft Azure service; cloud storage service; data lake service |
| basedOn | Azure Blob Storage |
| developer | Microsoft |
| hasFeature | POSIX-like ACLs; data partitioning; diagnostic logging; encryption at rest; encryption in transit; fine-grained access control; folder and file-level security; geo-redundant storage options; hierarchical namespace; high throughput; lifecycle management policies; low-cost storage tiers; massively scalable storage; monitoring and metrics; role-based access control |
| integratesWith | Azure Data Factory; Databricks [surface form: Azure Databricks]; Azure Event Hubs; Azure Functions; Azure HDInsight; Azure Machine Learning; Azure Purview; Azure Stream Analytics; Azure Synapse Analytics; Power BI |
| offersTier | archive tier; cool tier; hot tier |
| partOf | Azure [surface form: Microsoft Azure] |
| provides | big data analytics storage; data lake capabilities; enterprise data warehousing storage |
| regionAvailability | multiple Azure regions worldwide |
| securityModel | Azure Active Directory integration; access control lists (ACLs); role-based access control (RBAC) |
| supportsFormat | Avro; CSV; JSON; ORC; Parquet |
| supportsLanguage | dotnet/sdk [surface form: .NET SDK]; Java SDK; JavaScript SDK; Python SDK |
| supportsProtocol | Azure Blob Storage [surface form: ABFS (Azure Blob File System)]; HTTPS; NFS 3.0; REST API |
| supportsWorkload | ETL pipelines; batch processing; big data analytics; data lakehouse architectures; data warehousing; machine learning; real-time analytics |
| useCase | IoT and telemetry data storage; centralized enterprise data lake; data lakehouse storage layer; data science sandbox; log and clickstream analytics; raw and curated zone storage |
How these facts were elicited
The pipeline generated the facts above by prompting gpt-5.1 with this entity's name + description and the instruction below.
You are a knowledge base construction expert. Given a subject entity and a description of it, return factual statements that you know for the subject as a JSON list of dictionaries(triples), where keys must be "subject", "predicate" and "object". The number of facts may be very high, between 25 to 50 or more, for very popular subjects. For less popular subjects, the number of facts can be very low, like 5 or 10.
# Requirements
- If you don't know the subject at all, return an empty list.
- If the subject is not a named entity, return an empty list.
- Include at least one triple where predicate is "instanceOf".
- Do not get too wordy.
- Separate several objects into multiple triples with one object.
Subject: Azure Data Lake Storage
Description of subject: Azure Data Lake Storage is a scalable, secure cloud-based data lake service from Microsoft designed for big data analytics and enterprise data warehousing workloads.
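To make the requested output shape concrete, here is a small sketch of a response in that format together with a check of the prompt's structural requirements. The four sample triples are taken from the Statements table on this page. The helper names `validate_triples` and `group_by_predicate` are hypothetical, and the pipeline's own validation, if it has any, is not shown in the source.

```python
import json
from collections import defaultdict

REQUIRED_KEYS = {"subject", "predicate", "object"}


def validate_triples(raw: str) -> list[dict]:
    """Parse a JSON list of triples and check the prompt's structural rules."""
    triples = json.loads(raw)
    if not isinstance(triples, list):
        raise ValueError("expected a JSON list")
    for triple in triples:
        if not isinstance(triple, dict) or set(triple) != REQUIRED_KEYS:
            raise ValueError(f"malformed triple: {triple!r}")
    # An empty list is valid (unknown subject or not a named entity);
    # a non-empty list must contain at least one instanceOf triple.
    if triples and not any(t["predicate"] == "instanceOf" for t in triples):
        raise ValueError("missing instanceOf triple")
    return triples


def group_by_predicate(triples: list[dict]) -> dict[str, list[str]]:
    """Collect objects per predicate, one row per predicate as in the table."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for triple in triples:
        grouped[triple["predicate"]].append(triple["object"])
    return dict(grouped)


subject = "Azure Data Lake Storage"
sample = json.dumps([
    {"subject": subject, "predicate": "instanceOf", "object": "data lake service"},
    {"subject": subject, "predicate": "developer", "object": "Microsoft"},
    {"subject": subject, "predicate": "offersTier", "object": "hot tier"},
    {"subject": subject, "predicate": "offersTier", "object": "cool tier"},
])

triples = validate_triples(sample)
rows = group_by_predicate(triples)
print(rows)
```

Each triple carries exactly one object, as the last requirement asks, so a predicate with several objects (here `offersTier`) appears as several triples and is only grouped for display.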
Referenced by (10)
Full triples — surface form annotated when it differs from this entity's canonical label.