Kafka/Mongo connectors Flashcards

1
Q

What is the Kafka syntax to start a connection with a consumer?

A

1) from kafka import KafkaConsumer
2) establish a consumer object and save it to a variable (i.e. KafkaConsumer), passing the topic name and bootstrap server(s)
3) read data by iterating over the consumer object; each iteration yields one message

E.G:

from kafka import KafkaConsumer
consumer = KafkaConsumer('TOPIC_NAME', bootstrap_servers='localhost:9092', auto_offset_reset='earliest', enable_auto_commit=False)
for message in consumer:
    print(message.value)
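Since enable_auto_commit=False, offsets have to be committed by hand. A minimal sketch of a full read loop, assuming a broker on localhost:9092 and JSON-encoded message values like the producer cards below send (the topic name 'basketball' is just the placeholder from the example above):

import json
from kafka import KafkaConsumer

# value_deserializer reverses the producer's json.dumps(v).encode('utf-8')
consumer = KafkaConsumer(
    'basketball',
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',       # start from the oldest available offset
    enable_auto_commit=False,           # we commit offsets ourselves
    value_deserializer=lambda v: json.loads(v.decode('utf-8')),
)

for message in consumer:
    print(message.topic, message.offset, message.value)
    consumer.commit()                   # manually commit the offset just consumed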

2
Q

How do you connect to MongoDB with the pymongo client?

A

1) import pymongo
2) call MongoClient and save the client object to a variable
3) create a db (and collection) variable

from pymongo import MongoClient

myclient = MongoClient("mongodb://localhost:27017")
mydb = myclient["taxi_db"]
mycol = mydb["my_collection"]
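A quick way to confirm the connection works (a minimal sketch, assuming a mongod instance is actually running on localhost:27017):

from pymongo import MongoClient

myclient = MongoClient("mongodb://localhost:27017")
print(myclient.list_database_names())   # lists database names if the server is reachable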

3
Q

What is the pyodbc module?

A

A Python module that connects to databases through ODBC (Open Database Connectivity) drivers, e.g. SQL Server or Oracle. It is not a Java driver; JDBC is the Java counterpart of ODBC.
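A minimal usage sketch (the driver name, connection details, and table name are all assumptions; substitute whatever ODBC driver is installed):

import pyodbc

# connection string fields are illustrative placeholders
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=taxi_db;UID=user;PWD=password"
)
cursor = conn.cursor()
cursor.execute("SELECT * FROM rides")   # 'rides' is a hypothetical table
for row in cursor.fetchall():
    print(row)
conn.close()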

4
Q

What are some pymongo queries?

A

collection_variable.insert_one(dict_obj)   # insert a single document (a dict)

collection_variable.insert_many([{}, {}, {}])   # insert_many takes a list of documents
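For example, reusing the mycol collection variable from the MongoClient card (the document contents are made up for illustration):

mycol.insert_one({"ride_id": 1, "fare": 12.5})
mycol.insert_many([
    {"ride_id": 2, "fare": 8.0},
    {"ride_id": 3, "fare": 22.75},
])
print(mycol.find_one({"ride_id": 2}))   # query back a single matching document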

5
Q

What is the syntax for KafkaProducer?

A

1) import KafkaProducer (from kafka) and import json
2) establish a producer object and save it to a variable, with a value_serializer built on json.dumps (see code below)
3) send data via producer.send('TOPIC_NAME', variable_for_storage)

KafkaProducer(bootstrap_servers='localhost:9092', value_serializer=lambda v: json.dumps(v).encode('utf-8'))

example for producer method (response is assumed to be a requests.Response returned by an earlier HTTP GET):

import json
from kafka import KafkaProducer

def myprd(response):
    rows = response.text.splitlines()[1:]   # skip the CSV header row
    producer = KafkaProducer(bootstrap_servers='localhost:9092',
                             value_serializer=lambda v: json.dumps(v).encode('utf-8'))
    for row in rows:
        producer.send("taxirides3", row)    # each row is JSON-serialized by the value_serializer
    producer.flush()                        # block until all buffered messages are delivered
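And a usage sketch (the URL is a made-up placeholder for a CSV endpoint):

import requests

response = requests.get("http://example.com/taxi_rides.csv")   # hypothetical CSV source
myprd(response)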
6
Q

How do you write records to a file with a defined Avro schema, and then read the file entries back?

A

from avro import schema as sc
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter

schema = sc.Parse(open("kafka_schema.avsc", "rb").read())  # read the schema from file
writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
writer.append({"name": "Alyssa", "favorite_number": 256})
writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
writer.append({"name": "Mahmuda", "favorite_number": 9, "favorite_color": "blue"})
writer.append({"name": "hb", "favorite_number": 256, "salary": "12"})
writer.close()

reader = DataFileReader(open("users.avro", "rb"), DatumReader())

for user in reader:  # reading the file entries
    print(user)

reader.close()
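Note: DataFileWriter embeds the schema in users.avro itself, which is why DataFileReader only needs a DatumReader and not the schema file.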

7
Q

What does an example Avro schema file (.avsc) look like?

A

{
    "namespace": "users.avro",
    "type": "record",
    "name": "itdep",
    "fields": [
        {"name": "name", "type": ["string", "null"]},
        {"name": "favorite_number", "type": "int"},
        {"name": "favorite_color", "type": ["string", "null"]},
        {"name": "salary", "type": ["string", "null"]}
    ]
}
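Fields whose type is a union with "null" (name, favorite_color, and salary here) are nullable, which is why the records in the previous card could omit favorite_color or salary. favorite_number is a plain "int", so every record must supply it.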
