1000 friends of Pavel Durov: how to pump out VKontakte data

I once came across an interesting task: to draw, quickly and as simply as possible (with a minimum of libraries involved), a graph of the intersections between the audiences of different VKontakte groups. And I even did it. Then I thought: what if someone else needs to do something similar?

So here I will explain and show how to do the following:

1. Connect to the VKontakte website using Python 2.7 (you can always port it to 3.x if needed)

2. Create graphs using the NetworkX library (here we consider graphs of intersections of VKontakte group audiences)

3. Visualize these graphs (here we will need the matplotlib library a little)

So, given:

Assignment: draw a graph of intersections of users of different VKontakte groups

Basic tools: Python 2.7, NetworkX, matplotlib, numpy

Go!

Connecting to the API

If you search the vast network well, you can find useful resources that make connecting to the VKontakte API much easier.

First you need to download a special wrapper library - vk.com (aka vkontakte.ru) API wrapper.

Then it is highly advisable to download the module that simplifies VKontakte authorization here (many thanks to the good person Alexkutsan for this script). We will call the module vk_auth and import it in all the scripts that follow, so save it in the same directory as the main scripts. If the link is not working, you can get vk_auth from me: write in the comments or send me a request by email.

So, we have everything needed to connect to the VKontakte API quickly and easily. Let's check that we did everything right and that VKontakte now lets us in.

    #!/usr/bin/env python2
    # -*- coding: utf-8 -*-

    import vk_auth
    import vkontakte


    def test():
        # parameters for vk_auth.auth: your VKontakte login, your VKontakte password,
        # application id, and access zone, aka scope
        (token, user_id) = vk_auth.auth('your_login', 'your_password', '2951857', 'groups')
        vk = vkontakte.API(token=token)
        print "vk server time is ", vk.getServerTime()
        return 0


    # call the check when the script is run directly
    if __name__ == '__main__':
        test()

In this script you need to specify your VKontakte login and password (don't worry, this information will not leak anywhere; you can verify that by reading the vk_auth script code). As the application id, specify the identification number of your application, which must first be created on VKontakte, for example as shown in this video.

If the video is not very clear: on the page vk.com/developers.php there is a hidden “create an application” button in the upper right corner. It doesn't immediately catch your eye; it is located just below the “music” and “help” menu buttons. There you can create an application and use its client_id as the application id.

But! You can also use someone else's ID. For example, I used this one, 2951857, which was published in this article on Habrahabr. If the author has any complaints about the use of this identifier, I am ready to remove it from here.

Next, you need to specify the so-called scope, the access zone for our parser. Since we are going to parse groups, it is worth specifying 'groups'. You can read more about what can be specified as scope here.

Everything else can be left as is. If everything is correct, then by executing the code shown above, we will get the VKontakte server time.
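If you would rather not type your login and password into the script itself, one option is to read them from environment variables. This is only a sketch: the variable names VK_LOGIN, VK_PASSWORD and VK_APP_ID are made up for this example and are not something vk_auth or VKontakte define. It assumes vk_auth.auth takes the same four arguments as in the listing above.

```python
import os


def load_credentials(environ=os.environ):
    """Return (login, password, app_id, scope) in the order vk_auth.auth expects.

    VK_LOGIN, VK_PASSWORD and VK_APP_ID are hypothetical variable names
    chosen for this sketch; they are not defined by vk_auth or VKontakte.
    """
    missing = [name for name in ("VK_LOGIN", "VK_PASSWORD") if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Fall back to the application id used in the article's examples.
    app_id = environ.get("VK_APP_ID", "2951857")
    return environ["VK_LOGIN"], environ["VK_PASSWORD"], app_id, "groups"
```

With this helper the authorization call would presumably become `(token, user_id) = vk_auth.auth(*load_credentials())`.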

How to find out who a person is chatting with on VKontakte?

The safest way to find out who a user is chatting with is to ask them directly. This requires no additional tools and violates neither the law nor the rules of the VKontakte social network. But this direct approach often casts doubt on whether you trust the person, or may be simply inappropriate and offend the guy or girl.

There are no other legal and honest ways to get an answer to this question. Nevertheless, search results on the Internet offer various methods:

  • Third party services;
  • Online analyzer services;
  • Use of email;
  • Special programs for a mobile device;
  • Account hacking.

Below are the two methods that are, in our opinion, the most effective and safe.

Online analyzer services

Such analyzer services differ from fraudulent ones in that they operate legally. They work online and help to determine with whom a “friend” is corresponding on VKontakte.

The most popular and frequently used online analyzer services are:

  • The VKDIA.com website is a free service that does not require registration. To work with this analytical tool, the user needs to know the ID of the “target object”. Entering the address of the account of interest in the appropriate field and clicking the “Start tracking” button starts the collection of information. The system then analyzes the times at which the target enters and leaves the social network and checks which of the target's friends are online during matching intervals. From this you can estimate the likelihood that the target is communicating with a particular user.


  • The super-spy.com platform is one of the most thorough scanners of personal pages on the VKontakte social network. Thanks to its extensive functionality, the service monitors the user's time online and tracks likes given, comments left, and posts reposted, and so provides you with complete activity reports.


  • The resource vk-fans.ru is a service whose functioning is based on publicly available information. This online analyzer collects all possible information about an object in a short period of time and draws its conclusions.


Special mobile applications

There are currently many applications that promise to help users satisfy their curiosity about the activity of friends on the social network. Such applications differ from other methods in that they are legal. They work by collecting data about the account of interest and analyzing it. This includes:

  • Determining the activity of the “target object”, namely the time of activation of the “Online” mode and the person leaving his account;
  • Comparing activity data with the same indicators of his “Friends” and identifying matches;
  • Collection and analysis of received and given likes;
  • Tracking comments on the personal pages of other social network users or reposts.

Applications for Android OS that analyze the activity of VK users and make it possible to guess with whom they are corresponding are:

  • “VK Guests + Detective + Correspondence Protection”;
  • “Correspondence and guests”;
  • "I know";
  • "Spy for VKontakte."

The above list of spyware can be continued, since there is a lot of similar software available on the Google Play Market.


To use such an application, download and install the analytical tool of your choice from the application store, launch it, and enter the ID of the page you are interested in. An important nuance: such utilities do not hack accounts and do not transfer private data or personal correspondence to third parties; they only analyze available information.

By and large, special mobile applications are safe, but do not always provide reliable facts about the list of user conversations.

Parsing groups

Great! Everything worked out (at least it should have).

Now let’s try to get the necessary data on VKontakte groups - the number of participants for each group and a list of these same participants in the form of a list of IDs.

It is important to know that the VKontakte API returns a maximum of 1000 group members; you won't be able to pull more out of it. However, for a rough analysis of groups, that will do. If you need more, you will have to parse group pages directly.

The function below takes as input a list of VKontakte group names, and at the output gives the data we need for these groups.

    #!/usr/bin/env python2
    # -*- coding: utf-8 -*-

    import vk_auth
    import vkontakte
    import time


    def get_groups_users(groups_list):
        groups_out = {}

        (token, user_id) = vk_auth.auth('your_login', 'your_password', '2951857', 'groups')
        vk = vkontakte.API(token=token)

        for group in groups_list:
            # here we specify count=10, which will give us 10 users from the group
            # this is done for clarity. A maximum of 1000 users can be pulled out
            groups_out[group] = vk.get('groups.getMembers', group_id=group, sort='id_asc', offset=100, count=10)
            time.sleep(1)

        return groups_out


    if __name__ == '__main__':
        group_list = ['oldlentach', 'obrazovach', 'superdiscoteka']
        print get_groups_users(group_list)

    >>> {'oldlentach': {u'count': 740868, u'users': [1405, 1418, 1443, 1444, 1447, 1481, 1491, 1494, 1500, 1509]},
         'obrazovach': {u'count': 217978, u'users': [3172, 3192, 3196, 3213, 3317, 3328, 3331, 3356, 3361, 3420]},
         'superdiscoteka': {u'count': 150538, u'users': [20470, 20479, 20536, 21977, 22426, 22522, 22613, 22881, 23207, 23401]}}

The structure of the output data is as follows: the key is the name of the group, and the value is a dictionary with two keys: u'count', the number of members in the group, and u'users', a list of IDs of the members of this group (maximum 1000, as we remember).

The name of the group is taken from its VKontakte address. For example, there is a group called Obrazovach, which is located at https://vk.com/obrazovach, and we take the last part of the address, i.e. "obrazovach", as the group name.
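If you start from a list of group addresses rather than names, taking the last part of the address can be automated. Below is a small sketch of that step; the helper name group_name_from_url is made up for this example, and it assumes addresses of the form shown above.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7


def group_name_from_url(url):
    """Return the last path segment of a VKontakte group address."""
    path = urlparse(url).path
    # Drop a trailing slash, then keep what follows the last remaining slash.
    return path.rstrip("/").rsplit("/", 1)[-1]


print(group_name_from_url("https://vk.com/obrazovach"))
```

The resulting names can be passed straight to a function that expects a list of group names.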

Building a social graph

Now let's move directly to building a social graph.

Here we will use the NetworkX library, which is excellent for compiling, analyzing and visualizing graphs.

This is how you can create a graph for VKontakte groups:

  1. The input is a dictionary, where the key is the name of the VKontakte group, and the value is the number of members of this group and a list of a maximum of 1000 group member IDs (each ID is a VKontakte user ID)
  2. We create a vertex in the graph for each group, assigning as an attribute to each vertex a weight equal to the number of participants in the group
  3. Then, for each pair of vertices, if there is an intersection between them according to the IDs of the participants, an edge is created, and as an attribute we assign to each edge a weight equal to the number of users who are present in both groups.

The attributes that are assigned to vertices and edges are necessary for subsequent visualization. The greater the weight of a vertex (the number of participants in the group), the larger the size of the vertex in the diagram. The greater the weight of the edge (the intersection of the number of participants), the thicker the edge will be in the diagram.

This function builds the graph described above:

    #!/usr/bin/env python2
    # -*- coding: utf-8 -*-

    import networkx as nx


    def make_graph(groups_out):
        graph = nx.Graph()

        groups_out = groups_out.items()

        for i_group in xrange(len(groups_out)):
            # one vertex per group, with the member count stored as its size
            graph.add_node(groups_out[i_group][0], size=groups_out[i_group][1]['count'])

            # compare with every later group so that each pair is checked exactly once
            for k_group in xrange(i_group + 1, len(groups_out)):
                intersection = set(groups_out[i_group][1]['users']).intersection(set(groups_out[k_group][1]['users']))

                if len(intersection) > 0:
                    graph.add_edge(groups_out[i_group][0], groups_out[k_group][0], weight=len(intersection))

        return graph
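To see the pairwise step on its own, here is a sketch that computes the same edge weights without NetworkX, using itertools.combinations to visit every pair of groups exactly once. The function name group_overlaps and the sample data are made up for illustration; the input is assumed to have the same shape as the dictionary returned by get_groups_users.

```python
from itertools import combinations


def group_overlaps(groups_out):
    """Map each pair of group names to the number of member IDs they share."""
    # Convert each member list to a set once, so intersections are cheap.
    members = {name: set(data["users"]) for name, data in groups_out.items()}
    overlaps = {}
    # combinations() yields every unordered pair of groups exactly once.
    for first, second in combinations(sorted(members), 2):
        shared = members[first] & members[second]
        if shared:
            overlaps[(first, second)] = len(shared)
    return overlaps


# Toy data in the same shape as the get_groups_users() output (IDs are made up).
sample = {
    "group_a": {"count": 3, "users": [1, 2, 3]},
    "group_b": {"count": 3, "users": [3, 4, 5]},
    "group_c": {"count": 2, "users": [1, 3]},
}
print(group_overlaps(sample))
```

Each (first, second) key with its count corresponds to one edge and its weight in the graph; pairs with no common members get no entry, just as they get no edge.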

Let's visualize

Visualization is done using NetworkX methods built on matplotlib. You can read more about how to visualize a graph here. And here is an example of how exactly the graph that we created above is visualized. The parameters of the methods speak for themselves, so I believe no additional explanation is required.
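A minimal sketch of such a visualization (an illustration, not the author's original listing) might look like this. It assumes a graph whose vertices carry a 'size' attribute and whose edges carry a 'weight' attribute, as produced by make_graph. The helper names scale and draw_groups_graph, the pixel ranges and the output file name are arbitrary choices for this example.

```python
def scale(values, low, high):
    """Linearly rescale values into [low, high] so they can serve as plot sizes."""
    values = list(values)
    if not values:
        return []
    smallest, largest = min(values), max(values)
    if smallest == largest:
        # All values equal: use the midpoint to avoid dividing by zero.
        return [(low + high) / 2.0] * len(values)
    span = float(largest - smallest)
    return [low + (v - smallest) / span * (high - low) for v in values]


def draw_groups_graph(graph, filename="groups.png"):
    """Draw a graph whose nodes carry 'size' and whose edges carry 'weight'."""
    # Imported here so that scale() stays usable without the plotting libraries.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt
    import networkx as nx

    pos = nx.spring_layout(graph)
    # Bigger groups get bigger vertices, bigger intersections get thicker edges.
    node_sizes = scale([d["size"] for _, d in graph.nodes(data=True)], 300, 3000)
    edge_widths = scale([d["weight"] for _, _, d in graph.edges(data=True)], 1, 8)

    nx.draw_networkx_nodes(graph, pos, node_size=node_sizes)
    nx.draw_networkx_edges(graph, pos, width=edge_widths)
    nx.draw_networkx_labels(graph, pos)
    plt.axis("off")
    plt.savefig(filename)
```

Rescaling is used because raw member counts (hundreds of thousands) are far too large to pass directly as vertex sizes; calling draw_groups_graph(make_graph(groups_out)) would then write the picture to groups.png.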
