Create a Pandas DataFrame from a MongoDB query

To create a Pandas DataFrame from a MongoDB query, we will leverage our knowledge of creating MongoDB queries to get the information that we want.

Before running a query against MongoDB, determine the information you want to look at. By creating a query filter, you will save time by only retrieving the information that you want. This is very important when you have millions or billions of rows of data.
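As a rough sketch of the filtering idea above, the helper below builds a filter and a projection as plain dictionaries, which is the shape that pymongo's collection.find(filter, projection) accepts. The helper name build_query and the field names name and city are hypothetical, chosen only for illustration; 企业类型 is the field queried later in this post.

```python
def build_query(company_type, fields):
    """Return a (filter, projection) pair for collection.find().

    build_query is a hypothetical helper, not part of pymongo.
    """
    # Match only documents whose company-type field equals the given value.
    query_filter = {"企业类型": company_type}
    # Ask the server for the listed fields only; leave out _id unless requested.
    projection = {field: 1 for field in fields}
    if "_id" not in fields:
        projection["_id"] = 0
    return query_filter, projection


query_filter, projection = build_query(
    "有限责任公司(自然人投资或控股)",
    ["name", "city"],  # hypothetical field names
)
# With a live connection this would be passed on as:
#   collection.find(query_filter, projection)
print(projection)
```

Narrowing both the matched documents and the returned fields keeps the amount of data sent back from the server small, which matters most on very large collections.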

Steps:
1. To create a Pandas DataFrame from a MongoDB query, the first thing we need to do is import the Python libraries that we need:

import pandas as pd
from pymongo import MongoClient

2. Next, create a connection to the MongoDB database and use the connection to select the database and collection to query:

client = MongoClient('localhost', 27017)
db = client.smallbusiness
collection = db.company

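3. Run a query and put the results into an object called data. The filter below matches documents whose 企业类型 (company type) field equals 有限责任公司(自然人投资或控股), that is, a limited liability company invested in or controlled by natural persons: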

data = collection.find({"企业类型": "有限责任公司(自然人投资或控股)"})

4. Use Pandas to create a DataFrame from the query results:

company = pd.DataFrame(list(data))

5. Finally, show the top five results of the DataFrame:

print(company.head())

The complete code:

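import pandas as pd
from pymongo import MongoClient

# Connect to the local MongoDB server on the default port.
client = MongoClient('localhost', 27017)

# Select the database and the collection to query.
db = client.smallbusiness
collection = db.company

# Run the query; find() returns a cursor, not the documents themselves.
data = collection.find({"企业类型":"有限责任公司(自然人投资或控股)"})

# Iterate the cursor into a list and build the DataFrame from it.
company = pd.DataFrame(list(data))

# Show the first five rows, then the number of non-null values per column.
print(company.head())
print(company.count())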
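If no MongoDB server is at hand, the DataFrame step can still be tried on its own. The following is a minimal sketch that assumes the query results are plain dictionaries; the helper documents_to_frame and the sample documents are made up for illustration and are not part of pandas or pymongo.

```python
import pandas as pd


def documents_to_frame(documents, drop_id=True):
    """Build a DataFrame from any iterable of dict-like documents.

    documents_to_frame is a hypothetical helper for illustration.
    """
    # list() walks the iterable once, just as it does for a query cursor.
    frame = pd.DataFrame(list(documents))
    # MongoDB stores an _id field in every document; drop it if not needed.
    if drop_id and "_id" in frame.columns:
        frame = frame.drop(columns="_id")
    return frame


# Made-up sample documents standing in for query results.
sample = [
    {"_id": 1, "name": "Acme", "city": "Beijing"},
    {"_id": 2, "name": "Globex"},  # this document has no "city" field
]
company = documents_to_frame(sample)
print(company.head())
print(company.count())  # non-null values per column
```

Documents that lack a field simply get a missing value in that column, so count() can report different totals for different columns.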

How it works:

As we have seen in previous recipes in which we queried MongoDB, we create our query filter and then run it. The biggest difference is this bit of code:

company = pd.DataFrame(list(data))

Here, data is a cursor object. By definition, a cursor is a pointer to a result set. In order to retrieve the data that the cursor points to, you have to iterate through it. Calling list(data) creates a new Python list by iterating through the cursor for us, retrieving the underlying documents. That list is what we pass to pd.DataFrame to fill the DataFrame.
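To see why the list() call matters, here is a small sketch in which a plain Python generator stands in for the cursor. The fake_cursor function and its documents are invented for illustration; a real pymongo cursor is likewise consumed by iteration, and it offers a rewind() method if the results need to be read again.

```python
def fake_cursor():
    """Yield made-up documents one at a time, like iterating a cursor."""
    for number in range(3):
        yield {"_id": number, "name": "company-%d" % number}


data = fake_cursor()
first_pass = list(data)   # iterates the stand-in cursor and collects every document
second_pass = list(data)  # nothing is left: the iterator is already exhausted

print(len(first_pass))   # 3
print(len(second_pass))  # 0
```

Because the results can only be walked once this way, it is convenient to materialize them into a list (or directly into a DataFrame) before doing anything else with them.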

Copyright © 2010 - C++ Technology. All Rights Reserved.