class MyStreamListener(tweepy.StreamListener):
    def __init__(self, api=None):
        super(MyStreamListener, self).__init__()
        self.num_tweets = 0
        self.file = open("tweets.txt", "w")

    def on_status(self, status):
        tweet = status._json
        self.file.write( json.dumps(tweet) + '\n' )
        self.num_tweets += 1
        if self.num_tweets < 100:
            return True
        else:
            return False
        self.file.close()

    def on_error(self, status):
        print(status)
Does the line tweet_list.append(status) serve any specific purpose here?
The first time I ran the code, it threw a NameError saying that tweet_list doesn't exist. After removing that line, the code seems to run just fine.
I'm getting a NameError as well: "NameError: name 'tweet_list' is not defined". Anybody have a solution for this? I don't see where tweet_list is originally defined in the code.
@LincT @jakemore93 @wkkim-se excuse delay: i wasn't alerted of these messages until jake pinged me personally: thanks, jake!
tweet_list.append(status) does nothing and doesn't belong here; it is a relic from a previous version in which i saved to a list, rather than a file.
@wwkim-se : you're right -- it doesn't close; one really should close that! A while loop would suffice, i think
Can you please tell me how to access past data from Twitter, or send me a link where I can get all the details? print('Thanks'*10, 'in', "advance")
@wkkim-se of course not :-) Other than that, for the sake of correctness and memory leaks, self.file.write should be surrounded by a with context manager like,
# __init__()
self.file_name = "tweets.txt"
# on_status()
with open(self.file_name, 'w') as file:
    file.write(json.dumps(tweet) + '\n')
Hi! Can you explain what 'self' does? If you could explain the workings of this code in detail, it would be great! I'm a beginner at this. I'm not sure I understand the first part of the code or when the class gets called. Thanks!!
@Divkar94, Hugo explains it in the context of DataCamp's Importing Data in Python
@Divkar94 It's a somewhat advanced part of Python (the object-oriented / class part).
Put simply, self is a variable that refers to the object a method was called on, so each instance of the class gets its own self.
For example,
class X:
    def adrs(self):
        print(type(self))
        print(id(self))

x1 = X()
x1.adrs()
x2 = X()
x2.adrs()
In addition, through self we can assign values to that particular instance.
Hope you understand.
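To make the last point concrete, here is a minimal sketch of self holding per-instance state. TweetCounter and record are hypothetical names made up for this illustration; they are not part of the gist or of tweepy.

```python
class TweetCounter:
    """Hypothetical class, used only to illustrate `self`."""

    def __init__(self):
        # `self` is the instance being created, so this attribute
        # belongs to that one object and to no other.
        self.num_tweets = 0

    def record(self):
        # Python passes the calling object in as `self` automatically.
        self.num_tweets += 1
        return self.num_tweets


first = TweetCounter()
second = TweetCounter()
first.record()
first.record()
second.record()

# Each instance keeps its own count.
print(first.num_tweets, second.num_tweets)  # prints: 2 1
```

This is the same mechanism that lets self.num_tweets in the listener count tweets for one particular listener object.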
The file open mode in @plumps' code should be 'a' for appending, otherwise the previous contents will be overwritten with each new tweet.
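A small sketch of the difference, assuming the file is reopened once per tweet as in the snippet above. The helper names and the sample dictionaries are made up for the illustration, and a temporary directory stands in for tweets.txt.

```python
import json
import os
import tempfile


def write_tweets(path, mode, tweets):
    """Reopen `path` once per tweet, using the given open mode."""
    for tweet in tweets:
        with open(path, mode) as f:
            f.write(json.dumps(tweet) + "\n")


def count_lines(path):
    """Return the number of lines currently stored in `path`."""
    with open(path) as f:
        return sum(1 for _ in f)


tweets = [{"id": 1}, {"id": 2}, {"id": 3}]  # made-up stand-ins for tweets

with tempfile.TemporaryDirectory() as tmp:
    w_path = os.path.join(tmp, "w.txt")
    a_path = os.path.join(tmp, "a.txt")
    write_tweets(w_path, "w", tweets)  # each open truncates the file
    write_tweets(a_path, "a", tweets)  # each open adds to the end
    lines_w = count_lines(w_path)
    lines_a = count_lines(a_path)

print(lines_w, lines_a)  # prints: 1 3
```

With 'w' only the last tweet survives; with 'a' all three do.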
Can't we just put the close() statement above the return statement in the else block? As this is written now we never hit the close statement, so I am confused how the solution is a while loop. Is there anything wrong with this?
else:
    self.file.close()
    return False
datacamp rules- especially hugo.
As per the comments from @wkkim-se @plumps and @hiliev I believe there are corrections required in this class for downloading streaming Twitter data. I am new to Python so not yet into creating classes. I have made the corrections as per my understanding and it appends the text file. Hope it is correct. Could you review below code and possibly correct the code here in Github since there will be many like me who will be confused and hunting for the solution!
class MyStreamListener (tweepy.StreamListener):
    def __init__(self, api = None):
        super(MyStreamListener, self).__init__()
        self.num_tweets = 0
        self.file_name = "tweets.txt"
        #self.file = open("tweets.txt", "w")

    def on_status(self, status):
        tweet = status._json
        with open(self.file_name, 'a') as file:
            file.write(json.dumps(tweet) + '\n')
        self.num_tweets += 1
        if self.num_tweets < 100:
            return True
        else:
            return False

    def on_error(self, status):
        print(status)
@hugobowne Thanks for sharing this code. I had a query related to Loading and Exploring the twitter data.
I was trying the below code on my laptop. However, it returned an error. Error message has been shown below:
# Read in tweets and store in list: tweets_data
for line in tweets_file:
    tweet = json.loads(line)
    tweets_data.append(tweet)
Error Message: JSONDecodeError: Extra data: line 1 column 5703 (char 5702)
Could you please help me with this issue?
Nipun
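The message alone doesn't say what is in the file, but json.loads raises "Extra data" when a single line holds more than one JSON object, for example if two tweets were written without a newline between them. Assuming that is the cause here, one possible workaround is to decode every object on a line with json.JSONDecoder.raw_decode. The helper names iter_json_objects and load_tweets and the sample lines are made up for this sketch.

```python
import json


def iter_json_objects(line):
    """Yield every JSON value found back to back in one line of text."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(line)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and line[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the decoded value and the index where it ended.
        obj, pos = decoder.raw_decode(line, pos)
        yield obj


def load_tweets(lines):
    """Collect all objects from an iterable of lines (e.g. an open file)."""
    tweets_data = []
    for line in lines:
        tweets_data.extend(iter_json_objects(line))
    return tweets_data


# Made-up sample: the second line holds two objects, the third is blank.
sample_lines = ['{"id": 1}\n', '{"id": 2}{"id": 3}\n', '\n']
tweets_data = load_tweets(sample_lines)
print(len(tweets_data))  # prints: 3
```

With a real file this would be called as load_tweets(tweets_file). Lines that are not valid JSON at all will still raise JSONDecodeError, so this only helps with the concatenated-objects case.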
Anyone else getting a '401' response from Twitter when you replace the mock access and consumer keys with your own?
This code is the child class, where is the parent class? Post it please
@eric-ahlgren I think it will work just fine
@plumps If it was on_error and not on_status, and the file was opened in on_status not in init, wouldn't the file close? And if it did close, the file was opened in "w" mode but not "a" , wouldn't the content be lost every time the file is reopened?
@hugobowne can you please add the correct code?
@strashynskyi thanks for pinging me. it looks like the twitter API has changed so that this code doesn't run now. I don't have the bandwidth to go in and figure out what the correct code looks like. If someone else wants to, that would be great. I've made the following note in the description of this gist:
NOTE: this code is for a previous version of the Twitter API and I will not be updating in the near future. If someone else would like to, I'd welcome that! Feel free to ping me. END NOTE.
class MyStreamListener (tweepy.StreamListener):
    def __init__(self, api = None):
        super(MyStreamListener, self).__init__()
        self.num_tweets = 0
        self.file_name = "tweets.txt"
        #self.file = open("tweets.txt", "w")

    def on_status(self, status):
        tweet = status._json
        with open(self.file_name, 'a') as file:
            file.write(json.dumps(tweet) + '\n')
        self.num_tweets += 1
        if self.num_tweets < 100:
            return True
        else:
            return False

    def on_error(self, status):
        print(status)
Thanks
I’ve worked with similar scripts before, and handling rate limits and API errors is really important for something like this. Adding proper logging also helps a lot when debugging issues later. I used a similar setup when integrating data feeds into a website, and small tweaks made it much more stable.
This looks like a Twitter streaming listener script where tweets are collected in real-time and saved into a file, which is useful for data analysis or building datasets. Since it’s based on an older Twitter API version, it might need updates to work properly with the current API changes.
Working with older Twitter API code like this is still useful for understanding how streaming data collection and listeners work, especially for learning or maintaining legacy projects. Writing tweets to a file in real time is a simple but effective approach for data logging and analysis.
How does 'self.file.close()' get called when the preceding if-else block returns?
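As far as I can tell, it doesn't: both branches of the if-else return, so nothing after them in on_status runs. A minimal sketch of the fix suggested earlier in the thread, closing the file just before returning False, is below. FileListener is a hypothetical stand-in that does not use tweepy, its on_status takes a plain dict instead of a status object, and the limit parameter and temporary path are assumptions made so the example runs on its own.

```python
import json
import os
import tempfile


class FileListener:
    """Stand-in for the listener in the gist; it does not use tweepy."""

    def __init__(self, path, limit=100):
        self.num_tweets = 0
        self.limit = limit
        self.file = open(path, "w")

    def on_status(self, tweet):
        self.file.write(json.dumps(tweet) + "\n")
        self.num_tweets += 1
        if self.num_tweets < self.limit:
            return True
        # Close before returning: code placed after `return` never runs.
        self.file.close()
        return False


path = os.path.join(tempfile.mkdtemp(), "tweets.txt")
listener = FileListener(path, limit=3)
results = [listener.on_status({"id": i}) for i in range(3)]
print(results, listener.file.closed)  # prints: [True, True, False] True
```

The with-based version posted above avoids the question entirely, since the file is closed at the end of every on_status call.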