This three-part series demonstrates how to use Azure Cosmos DB to build an Intelligent App that uses historical pricing and product data to forecast future price fluctuations for specific products. In the first article of this series, you’ll set up and populate the Cosmos DB database with data to use in the later parts of the series.
Dynamic Repricing of Products Using Intelligent Apps Part 1: Setting Up and Populating Cosmos DB with Data
Intelligent Apps leverage data and artificial intelligence (AI) to provide smart, personalized, and adaptive experiences for users. AI and machine learning (ML) techniques like natural language processing (NLP), computer vision, and deep learning help understand context, intent, and user preferences to deliver relevant and timely insights and actions.
Some examples of Intelligent Apps include:
- Virtual assistants—Interactive applications that understand and execute user commands
- Chatbots—Automated messaging systems that provide information or assistance
- Recommendation systems—Algorithms that suggest relevant items based on user preferences and behavior
In this three-part series, you’ll create an Intelligent App powered by Azure Cosmos DB and AI/ML capabilities that dynamically suggests changes to product prices based on demand and historical trends. This app will help optimize revenue and customer satisfaction by adjusting product prices according to market conditions and customer behavior.
Laying the Groundwork for an Intelligent App with Cosmos DB
First, you’ll set up an Azure Cosmos DB database and populate it with product data and historical information about sales and demand. In part 2, you’ll analyze this data using AI and ML to forecast and suggest price changes.
Prerequisites
To follow this tutorial, ensure you have the following:
- An Azure account
- A Kaggle account to download the dataset this tutorial uses
Create an Azure Cosmos DB Account
Azure Cosmos DB is a fully managed multi-model database that ensures fast access to data, easy scalability, reliable uptime, and strong data consistency. Cosmos DB supports various data models and APIs, including SQL, MongoDB, Cassandra, Gremlin, and table storage, making it easy to query and manipulate data using familiar tools and languages.
Although you already have an Azure account, you also need to create an Azure Cosmos DB account by following the steps below:
-
Sign in to the Azure portal.
-
Click Create a resource on the upper-left side of the page.
-
Search for “Azure Cosmos DB” and select it. On the Azure Cosmos DB page, select Create.
-
Enter the settings for your new account:
-
Select your desired subscription.
-
Create a new resource group or select an existing one if you have one you’d like to use.
-
Enter a unique account name.
-
Select SQL (Core) as the API. This is the default API for Azure Cosmos DB and allows you to use SQL syntax to query and manage your data.
-
Select a Location for the account.
-
Click Review + create.
-
-
Review your account settings and click Create to create the account.
Complete the Data Skills Challenge to compete for the leaderboard and earn a Microsoft Learn Badge.
Create a Database and a Container
Next, you’ll create a database and container within Azure Cosmos DB. Databases facilitate management, billing, and scaling, while a container is a schema-agnostic grouping of items (documents) with a partition key and a provisioned throughput. The partition property determines how the data is distributed across physical partitions for scalability and performance.
To create a database and container, follow the steps below:
-
From the Azure portal, navigate to your Azure Cosmos DB account and select Data Explorer on the left menu. In the Data Explorer, select New Database on the top menu.
-
In the Add Database panel, enter a name for the new database, like “ProductsDB.”
-
Check Provision database throughput if you want to enable shared throughput for the database. This shares the throughput (RU/s) you provision among all containers in the database. You can also activate or deactivate autoscale, which automatically adjusts the throughput based on your application’s usage patterns.
-
Select OK to create the database.
-
In Data Explorer, expand the ProductsDB database and select New Container on the top menu. Then, open the Add Container panel and create a new container:
-
Enter “Products” as the container name.
-
Enter “/ITEM_ID” as the container’s partition key. This will partition the data by its
ITEM_ID
property, since columns with a wide range of values make excellent partition keys. -
Use the default value of 400 throughput units. If you’d like, you can also deactivate autoscale for the container.
-
-
Select OK to create the container.
Populate the Container
Now that you’ve created your database and container, you need to populate them with some data. For this demonstration, you’ll use a CSV file that contains UK inflation data. The dataset contains over 100,000 rows of data representing 600 products sold in UK shops over 12 months.
To populate the container with this data, follow these steps:
-
Download the CSV file.
-
In the Azure portal, navigate to your Azure Cosmos DB account and select Data Explorer on the left menu.
-
In Data Explorer, expand the ProductsDB database and the Products container, and select Items.
Upload the CSV File
Now, upload the CSV file:
-
From the top menu, select Upload Item.
-
In the Upload Item panel, select Browse, and choose the CSV file you downloaded previously.
-
Select Upload to upload the file to the container.
-
After the upload finishes, you should see the items in the container, each representing a row in the CSV file. You can select an item to view its properties and values in JSON format.
Verify the Data in the Container
To verify that the data in the container is correct and consistent, you can use the SQL query editor in the Data Explorer.
-
Select New SQL Query.
-
The query editor lets you execute SQL queries against the data in the container. For example, run the following query to get the container’s item count:
SELECT VALUE COUNT(1) FROM c
You should get a result of
10000
, which matches the number of rows in the CSV file. -
You can also run queries to check the data quality and integrity, like the following:
-
Get the distinct values of ITEM_ID —
SELECT DISTINCT VALUE c.ITEM_ID FROM c
-
Get the average price of each product —
SELECT c.ITEM_ID, c.ITEM_DESC, AVG(c.PRICE) AS avg_price FROM c GROUP BY c.ITEM_ID, c.ITEM_DESC
-
Get the price trend of a product over time —
SELECT c.QUOTE_DATE, c.PRICE FROM c WHERE c.ITEM_ID = '210102' ORDER BY c.QUOTE_DATE
-
-
You can also use the built-in charts to visualize the query results. In the top-right corner of the query editor, select Chart and choose the chart type you want to use, such as line, bar, or pie.
Next Steps
In this article, you configured an Azure Cosmos DB database and populated it with data about product price changes. You also verified the data in the container using SQL queries and charts.
In the next part of the series, you’ll learn how to use Azure’s AI and ML capabilities to analyze the data and suggest product price forecasts.
If you want to challenge yourself and learn more about Azure, Cosmos DB, and AI/ML, we encourage you to participate in the Data Cloud Skill Challenge. You can also register for AKS Customer and Lab Days at the premier conference for cloud-native technologies, KubeCon EU 2024.