Egyptian Informatics Journal (2016) 17, 21-43
Cairo University Egyptian Informatics Journal
www.elsevier.com/locate/eij www.sciencedirect.com
REVIEW
Analyzing negative ties in social networks: A survey
Mankirat Kaur *, Sarbjeet Singh
UIET, Panjab University, Chandigarh, UT, India
Received 20 February 2015; revised 18 July 2015; accepted 13 August 2015 Available online 26 September 2015
KEYWORDS
Centrality; Eigenvector; Graph complement; Negative cohesive subgroups; Negative ties
Abstract Online social networks are a medium for sharing information and maintaining personal contacts with other people through social interactions, thereby forming virtual communities online. Social networks are crowded with positive and negative relations. Positive relations are formed by support, endorsement and friendship and thus create a network of well-connected users, whereas negative relations result from opposition, distrust and avoidance and create disconnected networks. Owing to the increase in illegal activities such as masquerading, conspiring and creating fake profiles on online social networks, exploring and analyzing these negative activities has become the need of the hour. Negative ties are usually treated in the same way as positive ties in many theories, such as balance theory and blockmodeling analysis, but the standard concepts of social network analysis do not yield the same results for both kinds of tie. This paper presents a survey on analyzing negative ties in social networks through the various network analysis techniques that are used for examining ties, such as status, centrality and power measures. Because flow behaves differently in positive and negative tie networks, some of these measures are not applicable to negative ties. This paper also discusses new methods that have been developed specifically for analyzing negative ties, such as negative degree and the h* measure, along with measures based on a mixture of positive and negative ties. The different social network analysis approaches are reviewed and compared to determine the approach that can best identify negative ties in online networks. The analysis shows that only a few measures, such as Degree and PN centrality, are applicable for identifying outsiders in a network. For applicability to online networks, the performance of the PN measure needs to be verified and, further, new measures should be developed based upon the negative clique concept.
© 2015 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Information, Cairo University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
* Corresponding author. Tel.: +91 9463543880. E-mail addresses: mannsunshine09@gmail.com (M. Kaur), sarbjeet@pu.ac.in (S. Singh).
Peer review under responsibility of Faculty of Computers and Information, Cairo University.
http://dx.doi.org/10.1016/j.eij.2015.08.002
ISSN 1110-8665
Contents
1. Introduction
2. Standard methods
2.1. Analyzing ties through equivalence concepts
2.1.1. Structural equivalence
2.1.2. Regular equivalence
2.1.3. Automorphic equivalence
2.2. Analyzing ties through statistical techniques
2.3. Analyzing ties through centrality measures
2.3.1. Degree centrality
2.3.2. Betweenness centrality
2.3.3. Closeness centrality
2.3.4. Eigenvector centrality
3. Influence measures
3.1. Katz measure
3.2. Hubbell measure
3.3. Bonacich power measure
4. Projected approaches for analyzing ties
4.1. Graph complement approach
4.2. Negative cohesive subgroups
5. New measures of negative ties
5.1. Negative degree centrality
5.2. h* centrality
6. Mixed data measures
6.1. Degree centrality measure
6.2. Status measure
6.3. Political Independence Index (PII)
6.4. PN centrality measure
7. Strengths and weaknesses of measures
8. Conclusion and future scope
References
1. Introduction
Online social networks (OSNs) are becoming popular among a large number of people as a means of forming virtual communities online. These communities are developed by creating profiles and maintaining the personal contacts of each user through social interactions. Schneider et al. [1] described online social networks as follows: "OSNs form online communities among people with common interests, activities, backgrounds, and/or friendships. Most OSNs are Web-based and allow users to upload profiles (text, images, and videos) and interact with others in numerous ways". Social interactions in these networks are full of positive and negative relations. Positive relations are formed by support, endorsement and friendship and thus create a network of well-connected users, which is useful for the promotion of market products, brands, services and new research ideas on social media [2]. Negative relations, on the other hand, are a result of opposition, distrust, antagonism and avoidance. Negative relationships represent a persistent, recurring set of negative social intentions toward another person. Negative relations are central to clusterability theory [3], balance theory [4,5], their generalizations [6], semigroup work [7] and social network analysis [8]. Many areas of negative networks need extensive research, such as bullying, group conflicts and social exclusion. Criminal activities in online networks are increasing: people masquerade as others by creating fake profiles, and conspire against business rivals by posting fake reviews of the products being promoted on these networks. There is therefore a need to identify these negative actors of the network by observing their connection patterns. In another example, a political party opposes the policies and efforts of other parties on these networks by commenting against them, and exaggerates its own policies by presenting them as being in the welfare of the public. Therefore, these undesirable activities of political parties need to be monitored.
Many researchers have studied negative ties in different fields of research. Bohn et al. [9] analyzed the access to social capital in OSNs by examining the actual communication ties (positive and negative) among actors, whereas Box-Steffensmeier and Christenson [10] studied interest group coalitions by considering positive and negative ties. Smith et al. [11] assessed the phenomenon of homophily in social networks by studying network ties of both positive and negative nature. De Jong et al. [12] analyzed the impact of negative relationships on the attitudes of each team member and on performance in teams. Recently, many researchers have investigated negative ties to explore different concepts in social networks, such as studying attitude diffusion [13] and analyzing aspects of inter-ethnic relationships between secondary school students [14].
The interest of this paper is in relations that are themselves negative, rather than in the negative consequences of positive relations. There are many standard datasets depicting such relations, for example the skirmish relation in the "bank wiring room" data reported by Roethlisberger and Dickson in 1939 [15]; the enmity ("hina") relation reported by Read in 1954 [16] for the social network of tribes of the Gahuku-Gama alliance structure of the Eastern Central Highlands, New Guinea; and the disesteem and disliking relations among a group of monks reported by Sampson in 1969 [17]. The same types of relations are also present in today's online networks such as Wikipedia, Epinions and Slashdot, which are online rating websites. In Wikipedia, users can vote for or against other users in elections to nominate them for admin status. If one user gives a supporting or opposing vote to another user, a positive or negative link, respectively, is created [18]. Epinions.com is a common consumer review website which forms an online social trust network where users create signed relations of trust or distrust with each other. Members of the site can give positive or negative ratings to the products on the website as well as rate the reviews given by other members [19,20]. Visitors to the site can check new and old reviews of a product and then decide which product to purchase. The links between nodes, i.e. the persons who review products, are explicitly labeled as positive or negative. Slashdot is a technology-related news website with a specific user community. News related to current technology is submitted by users; the editors of the site evaluate it and open it for discussion among readers. Each news article also has a comments section attached to it, where users of the site post their comments, leading to an open threaded discussion. The comments associated with each news article are then rated by the editors of the site using a moderation system. The moderators rate each comment as +1 or −1 [21-23]. In 2008, Slashdot launched a new feature known as Slashdot Zoo, which allows users to tag other users as friends (positive link) or foes (negative link). All these represent negative ties between actors in a group. It has been found that negative ties are fewer in number and are generally treated in the same way as positive ties in most concepts, such as blockmodeling analysis [24-26].
The question now arises whether the standard methods used for analyzing positive ties in networks can also be used for studying negative ties. Answering it involves a systematic analysis of the standard techniques and of the changes in interpretation needed to make them applicable to negative ties. Several concerns arise when dealing with negative tie data:
• Transitivity: In positive relations, ideas or information are quickly delivered to the group of nodes directly or indirectly connected to a node, usually along paths of length two or more, which reflects a high level of transitivity. In negative relations, flow does not follow this pattern: information hardly diffuses along paths longer than one edge, which reflects a low level of transitivity.
• Sparseness: Positive ties form very dense networks, whereas negative networks are very sparse and usually form highly disconnected graphs, which makes centrality approaches (used to find the most central and influential node in a network, through which most of the traffic flows) difficult to apply.
For example, in a positive network, information can easily diffuse from m to p through the directed path m → n → p, where m is connected to n, which is in turn connected to p, but this is not possible in negative networks. For these reasons, not all standard approaches are applicable to negative relations.
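The two concerns above can be illustrated with a minimal Python sketch. The signed edge list is hypothetical toy data, not one of the datasets cited in this paper, and the helper names (split_by_sign, density, two_step_paths) are introduced here only for illustration. The sketch separates a signed digraph into its positive and negative parts, then compares their densities and the number of directed two-step paths such as m → n → p.

```python
from collections import defaultdict

# Hypothetical toy data: (source, target, sign); +1 = positive tie, -1 = negative tie.
signed_edges = [
    ("m", "n", +1), ("n", "p", +1), ("p", "q", +1), ("m", "q", +1),
    ("m", "r", -1), ("s", "p", -1),
]

def split_by_sign(edges):
    """Return adjacency dicts for the positive and the negative subnetwork."""
    pos, neg = defaultdict(set), defaultdict(set)
    for u, v, sign in edges:
        (pos if sign > 0 else neg)[u].add(v)
    return pos, neg

def density(adj, nodes):
    """Directed density: ties present divided by ties possible."""
    n = len(nodes)
    ties = sum(len(targets) for targets in adj.values())
    return ties / (n * (n - 1))

def two_step_paths(adj):
    """All directed paths u -> v -> w of length two with u, v, w distinct."""
    return sorted(
        (u, v, w)
        for u in list(adj)
        for v in adj[u]
        for w in adj.get(v, ())
        if len({u, v, w}) == 3
    )

nodes = {x for u, v, _ in signed_edges for x in (u, v)}
pos, neg = split_by_sign(signed_edges)

print("positive density:", density(pos, nodes))
print("negative density:", density(neg, nodes))
print("positive two-step paths:", two_step_paths(pos))
print("negative two-step paths:", two_step_paths(neg))
```

In this toy network the negative part is sparser than the positive part and contains no two-step path at all, which is the situation that makes flow-based centrality measures hard to apply to negative ties.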
Therefore, this paper discusses many recent techniques that are used for examining only negative ties. For example, by using a strong negative node centrality measure, we can analyze whether a person is chosen or ignored for promotion in the trust network of an organization, based upon the high or low score calculated through this measure [27]. Similarly, potential victims of bullying can be identified through the scores calculated by these measures, so that intervention can be planned at an early stage. In practical situations, both positive and negative ties exist in social networks. So, in order to analyze the joint effect of these ties on the centrality scores of actors, we must have measures that can examine both types of ties simultaneously. Therefore, various techniques of this type are explored, analyzed and compared on the basis of factors such as the weightage of direct and indirect links, the concept used for analyzing ties, and complexity. The focus of this paper is to survey social network analysis approaches in order to find the techniques that are suitable for identifying the outsiders or negative nodes of a network. To date, the study of negative ties has been carried out on small datasets of offline networks. Analyzing these ties on datasets of online social networks is still an unexplored area.
This paper is organized as follows. Section 2 discusses the standard methods of analyzing ties and checks their applicability on negative ties; Section 3 consists of studying negative ties through influence measures. Some new projected approaches for interpreting negative ties by using standard measures are discussed in Section 4. New measures of negative ties are analyzed in Section 5 followed by discussion and comparisons of various mixed data measures in Section 6. The strengths and weaknesses of all the discussed measures are summarized in Section 7. Finally, the paper is concluded with some new future directions in current field. An attempt has been made to cover important studies made in this field but study in this paper shall not be considered exhaustive in any sense.
2. Standard methods
2.1. Analyzing ties through equivalence concepts
A network is defined as a set of ordered pairs of nodes exhibiting relations. It is depicted by a graph, more generally a digraph (directed graph). A relation is a specific type of tie existing between nodes of the graph, usually represented by aRb, where the tie runs from a to b. To capture the intermingling of role relations in a social process or structure, social ties are examined as follows [28]:
a. Generation and elimination of social ties are commenced by other ties existing in network.
b. The impulses generated by individual nodes in network lead to activation of social process (communication among population).
c. The nature of tie between pair of persons or nodes depends upon pattern that is followed by ties of that type in network.
Symmetry is an important property of social relations: for any nodes m, n and relation R, if mRn then nRm must also exist. Friendly relations in a social network are also symmetric, that is, "Receiving and acknowledging the friendly expectations projected to one by friend, is treated as an intrinsic part of one's own projection of same friendly expectations to him, so long as all pairs in the population are symmetric in this way" [28]. Symmetry is also present in negative relations. Equivalent nodes in a social network can be described as nodes that have a similar pattern of relations with other nodes of the network and thus play the same roles and occupy the same positions. There are primarily three types of role or positional equivalence in a network, as outlined below.
2.1.1. Structural equivalence
Different authors have defined structural equivalence of nodes in different ways. Lorrain and White [28] gave a categorical approach in which relations between nodes that can be compounded are called morphisms. Different types of relation between nodes are coupled to form a compound relation, for example one's sister's friend or one's batch mate's friend; these are composites of two relations R and S, where R is sister in the first case and batch mate in the second case, and S is friend in both cases. The compound relation for any nodes a and c is defined as a(RS)c if and only if there exists a node b such that aRb and bSc, where the relations R and S are morphisms. The set of nodes, together with the morphisms and the composition operation that forms RS from R and S, is known as a category. A category C consists of two classes, $C_{Obj}$ and $C_{Mor}$. The elements of class $C_{Obj}$ are the nodes in the network, also called objects, and the elements of $C_{Mor}$ are the morphisms that exist between pairs of nodes. They defined the concept of category as composed of three parts: the objects, the morphisms and the composition of morphisms. On the basis of this concept, Lorrain and White [28] defined structural equivalence as "Objects a, b of category C are structurally equivalent if, for any morphism M and any object x of C, aMx if and only if bMx, and xMa if and only if xMb." This means that nodes a and b are structurally equivalent if, whenever node a relates to any other node x in the network through a relation M, the same relation also exists between b and x. Thus, the nodes a and b are completely equivalent and are substitutable [28].
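The compound relation a(RS)c described above can be sketched in a few lines of Python. This is only an illustration under the assumption that each relation is stored as a set of ordered pairs; the function name compose and the person names are hypothetical and do not come from Lorrain and White [28].

```python
def compose(R, S):
    """Compound relation RS: a(RS)c holds iff some b exists with aRb and bSc."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

# Hypothetical example data: "sister" and "friend" relations as sets of ordered pairs.
sister = {("ann", "beth")}                   # ann's sister is beth
friend = {("beth", "carl"), ("dave", "ann")}  # beth's friend is carl, dave's friend is ann

# "one's sister's friend": first follow sister, then friend.
sisters_friend = compose(sister, friend)
print(sisters_friend)
```

Composition is order-sensitive: compose(friend, sister) yields "one's friend's sister", which is in general a different relation.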
Another definition of structural equivalence was given by White and Reitz [25], who introduced the concept of homomorphism as the mapping of the points or nodes of a graph to the actors or nodes in the image of the graph. They described structural homomorphism as "There are some types of roles in networks in which it is expected that all the occupants of that role should be identically connected to occupants of counterpart's role" [25], i.e. nodes a and b belong to role 1 if they have the same image under the mapping function f:
x = f(a) = f(b)
Similarly, d and e belong to role 2 as:
y = f(d) = f(e)
Suppose a node a is related to a node d, and a node b is related to a node e, through a relation R; then:
aRd and bRe, i.e. (member of one role) relation (member of counterpart's role)
Also, d and e are counterpart roles of a and b if aRd, aRe, bRe and bRd exist in the network. Structural homomorphism states that all the members of the role {a, b} should be related to all the members of the counterpart's role {d, e} as well as to their own members [25]. This can be represented pictorially as follows (see Fig. 1).
Figure 1 Structural homomorphism: a and b belong to one role, and d and e belong to the counterpart's role.
The equivalence induced by structural homomorphism is structural equivalence. Structural equivalence is then defined as "If G = (P, R) where P is the finite set of points or nodes of graph G and R is the relation on set of ordered pairs of points or nodes, and ≡ is an equivalence relation on P then ≡ is a structural relation if and only if for a, b, c ∈ P where a ≠ b ≠ c, a ≡ b implies [25]:
i. aRb if and only if bRa;
ii. aRc if and only if bRc;
iii. cRa if and only if cRb and
iv. aRa implies aRb."
According to this definition, structurally equivalent nodes are related to each other by the same relation R that also relates them to the other nodes of the network.
A third definition is given by Everett and Borgatti [26]: two vertices or nodes a and b of a digraph are considered structurally equivalent if and only if $N_i(a) = N_i(b)$ and $N_o(a) = N_o(b)$. Here, $N_i(a)$ is the in-neighborhood of vertex a, i.e. the set of vertices or nodes from which a receives connections, and $N_o(a)$ is the out-neighborhood of vertex a, i.e. the set of vertices which receive connections from a. If the in-neighborhood and out-neighborhood sets of two vertices are the same, then those vertices are structurally equivalent. All three descriptions explain the same concept, which can be summarized as: "Any nodes m, n are structurally equivalent if both nodes are having same type of ties to same nodes in network."
Clearly, this equivalence of nodes can occur in a positive tie network such as a school environment in which two teachers, say A and B, teach the same set of students {E, C, D} in the same school. A and B are then structurally equivalent nodes in the school network, with a positive tie existing between teacher and students. If we interchange their positions, there is no effect on the structure of the network, as they are teachers in the same network of students. In the case of negative ties such as bullying, it is possible to have nodes A and B bullying a group of students {M, N, P} who are potential victims. A and B are then considered structurally equivalent, as they have the same ties with the same group of students, and M, N and P are structurally equivalent to one another as victims of bullying who have an abhorrent relation with the bullies {A, B}. The nodes A and B are expected to have similar roles and positions in the network. Thus this equivalence is also possible in a negative tie network.
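The neighborhood-based definition of Everett and Borgatti [26] lends itself to a direct check. The following Python sketch applies it to the bullying example, under the assumption that a tie (u, v) means "u bullies v"; the function names are introduced here for illustration only.

```python
def neighborhoods(edges, node):
    """Return (in-neighborhood, out-neighborhood) of node in a digraph."""
    n_in = {u for u, v in edges if v == node}
    n_out = {v for u, v in edges if u == node}
    return n_in, n_out

def structurally_equivalent(edges, a, b):
    """True when a and b have identical in- and out-neighborhoods."""
    return neighborhoods(edges, a) == neighborhoods(edges, b)

# Negative tie network from the text: A and B each bully M, N and P.
bullying = {(bully, victim) for bully in "AB" for victim in "MNP"}

print(structurally_equivalent(bullying, "A", "B"))  # the two bullies
print(structurally_equivalent(bullying, "M", "N"))  # two of the victims
print(structurally_equivalent(bullying, "A", "M"))  # a bully and a victim
```

The same function works unchanged for the teacher and student example, since the definition depends only on the pattern of ties and not on their sign.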
2.1.2. Regular equivalence
This concept is used in graph theory for nodes that show equivalence in terms of their connections and relations to other nodes or actors in a graph. Many authors have defined regular equivalence using different graph concepts. The term was first used by White and Reitz [25], who based it on the concept of homomorphism. They described regular homomorphism as "In role systems it is expected that all the occupants of one role should be identically connected to occupants of their counterpart's role" [25]. Here, each member should be connected to at least one member of the counterpart's role, but connection to every member of the counterpart's role, as required by structural homomorphism, is not necessary. Continuing the previous example, regular homomorphism only requires:
a → d
b → e
The cross connections of a to e and b to d are not necessary in this mapping. The equivalence induced by regular homomorphism is regular equivalence, which was defined as "If G = (P, R) where P is a set of points and R is relation between ordered pair of nodes, ≡ is an equivalence relation on P, then ≡ is a regular equivalence if and only if for all a, b, c ∈ P, a ≡ b implies [25]:
i. aRc implies there exists d ∈ P such that bRd and d ≡ c and
ii. cRa implies there exists d ∈ P such that dRb and d ≡ c."
According to this, points in a graph are considered regularly equivalent if they are related to their corresponding equivalents by the same relation R. Everett and Borgatti [26] used the concept of regular coloration to highlight various results known about regular equivalence. They defined a directed graph or digraph as D(V, E), where V is the set of vertices and E is the set of edges, and introduced the in-neighborhood and out-neighborhood of a vertex j as $N_i(j)$ and $N_o(j)$. A coloration of digraph D was described as an allocation of colors to its vertices, where C(m) denotes the color of vertex m. A coloration is regular if and only if for all m, n ∈ V, C(m) = C(n) implies [26]:
i. $C(N_i(m)) = C(N_i(n))$ and
ii. $C(N_o(m)) = C(N_o(n))$.
That is, if two vertices in the digraph have the same color, then their in-neighborhoods must have the same set of colors and their out-neighborhoods must have the same set of colors. Such a coloration is regular in nature and induces regular equivalence among the nodes of the graph. They also used an image digraph D′ to generate the set of equivalence classes which contain regularly equivalent vertices. The image digraph is D′(C(V), E′), where C(V), the set of colors of the vertices in D, provides the labels of the vertices in D′, and E′ is the set of edges; an edge exists between a pair of vertices of D′ if and only if nodes of those colors in D are adjacent to each other [26]. For example, consider the graph of five vertices connected to each other through directed links in Fig. 2. Vertices with overlapping sets of in- or out-neighborhood vertices are given the same color and the rest are given different colors. The name of each vertex is written inside the circle and its color is written outside. This coloration is regular, as the colors of the in-neighbors of 4 and 2 are red and yellow and those of their out-neighbors are yellow for both. Similarly, the colors of the in-neighbors of 1 and 5 are red and yellow and the colors of their out-neighbors are green and red for both. The image digraph D′ is shown in Fig. 3.
There is an edge from vertex red to vertex green because node 1 is connected to node 2 and node 5 is connected to node 4. This image digraph divides the 5 vertices into 3 equivalence classes, within each of which the nodes are regularly equivalent to each other (for example, node 1 and node 5).
Figure 2 Colored graph with the names of the vertices written in circles and the color of each vertex written outside its circle. Regularly equivalent nodes are colored with the same color.
Figure 3 Image digraph in which the vertex names written inside the ovals are the colors of the vertices of the digraph in Fig. 2.
Both statements explain that regularly equivalent points or vertices in a graph are those that are connected to their corresponding equivalents by a single relation R. When R is a positive tie, such as the father and son relationship, there can exist regularly equivalent nodes or persons a and b belonging to the regular set of fathers, such that a is the father of c and b is the father of d, where c and d are members of the regularly equivalent set of sons. The same holds in a negative tie network, where nodes are negatively connected, such as a gossip network at a workplace in which A gossips about C and B gossips about D because of some conflict over a project directed by all of them. Here A and B both act as regularly equivalent nodes having a similar type of relation to their equivalents C and D in the group.
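The regular coloration condition of Everett and Borgatti [26] can be checked mechanically. The Python sketch below is an illustration only: it assumes a tie (u, v) means "u gossips about v", and the color labels "gossiper" and "target" are hypothetical names chosen for the gossip example. A coloration passes when all vertices of one color see the same set of colors among their in-neighbors and the same set among their out-neighbors.

```python
def is_regular_coloration(edges, color):
    """Check that same-colored vertices have equal in- and out-neighbor color sets."""
    signatures = {}
    for node, c in color.items():
        sig = (
            frozenset(color[u] for u, v in edges if v == node),  # colors of in-neighbors
            frozenset(color[v] for u, v in edges if u == node),  # colors of out-neighbors
        )
        # The first vertex of each color fixes the signature; later ones must match it.
        if signatures.setdefault(c, sig) != sig:
            return False
    return True

# Gossip network from the text: A gossips about C, B gossips about D.
gossip = {("A", "C"), ("B", "D")}

roles = {"A": "gossiper", "B": "gossiper", "C": "target", "D": "target"}
mixed = {"A": "x", "C": "x", "B": "y", "D": "y"}  # puts a gossiper and a target together

print(is_regular_coloration(gossip, roles))
print(is_regular_coloration(gossip, mixed))
```

Note that A and B are regularly equivalent here although they are not structurally equivalent, since their out-neighborhoods {C} and {D} differ; this is exactly the relaxation that regular equivalence introduces.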
2.1.3. Automorphic equivalence
The concept of automorphism was first used by Lorrain and White [28] in the context of social networks. They used the term endomorphisms for self-loops in a directed graph D(V, E), representing a node's consciousness of its locus in the social network structure. Endomorphisms were used as an important notion for identifying the group of nodes with which a node in the network is most likely to be associated [28]. A more precise statement was given by Everett and Borgatti [26]: "Automorphism is any permutation π on set of vertices V so that adjacency will remain preserved after permutation", i.e. (m, n) ∈ E (the set of edges in D) if and only if (π(m), π(n)) ∈ E. For the graph in Fig. 2, consider π(1) = 5, π(2) = 4, π(4) = 2, π(5) = 1. Automorphic equivalence can then be stated as "two vertices m and n of digraph D are considered automorphically equivalent if and only if there exists an automorphism π such that π(m) = n" [26].
This notion of equivalence can be used in a positive network. Consider two branches of a school as a social network, with A as the principal of the group of teachers {M, N, P} in the first branch and B as the principal of the group of teachers {U, V, W} in the second branch. D is the director of both branches of the school. A and B are automorphically equivalent to each other, as shown in Fig. 4. If the director decides to move A to branch 2 and B to branch 1, this transfer cannot take place without also exchanging the groups of teachers if the relations in the structure of the network are to be preserved. Similarly, in a negative tie network of bullying, nodes A and B bullying two different groups of pupils can be considered automorphically equivalent: if we permute their positions in the network along with their groups of pupils, the structure of the network remains unchanged.
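The school example can be verified with a short Python sketch. It assumes, for illustration, that a tie (u, v) means "u supervises v"; the function name is_automorphism is introduced here and a permutation is represented as a dictionary from each vertex to its image.

```python
def is_automorphism(edges, perm):
    """True when perm is a bijection on the vertices that preserves adjacency."""
    nodes = {x for edge in edges for x in edge}
    if set(perm) != nodes or set(perm.values()) != nodes:
        return False  # not a permutation of the vertex set
    # For a bijection, mapping the edge set onto itself is equivalent to
    # (m, n) in E  <=>  (perm[m], perm[n]) in E.
    return {(perm[u], perm[v]) for u, v in edges} == set(edges)

# Director D, principals A and B, teachers {M, N, P} and {U, V, W}.
school = {("D", "A"), ("D", "B")}
school |= {("A", t) for t in "MNP"} | {("B", t) for t in "UVW"}

# Swap the principals together with their groups of teachers.
swap_branches = {"D": "D", "A": "B", "B": "A",
                 "M": "U", "N": "V", "P": "W",
                 "U": "M", "V": "N", "W": "P"}

# Swap only the principals and leave the teachers in place.
swap_principals = {**{n: n for n in "DMNPUVW"}, "A": "B", "B": "A"}

print(is_automorphism(school, swap_branches))
print(is_automorphism(school, swap_principals))
```

The first permutation maps A to B and preserves every tie, so A and B are automorphically equivalent; the second breaks the principal to teacher ties, which mirrors the remark that the principals cannot be exchanged without their groups of teachers.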
From this discussion it is clear that all general concepts and theories of equivalence of nodes are applicable to both positive and negative ties and help in analyzing the pattern of flows and the structure of nodes in a network.
2.2. Analyzing ties through statistical techniques
All equivalence concepts are equally applicable to both types of ties. There also exist other methods, known as statistical techniques, such as QAP correlation [29], regression [30] and ERGM, that are used to check the similarity or correlation among various measures. These approaches can be used for analyzing both types of ties, but the model required for correlation may differ. For example, a positive relation of friendship with colleagues within an organization can be positively correlated with belonging to the same state, as well as with promotion on the job, as these colleagues may help by providing critical information and references. Similarly, a negative relation of disliking among colleagues can be positively correlated with demotion in the organization structure, as colleagues may provide negative feedback about a person to a higher authority. However, belonging to the same state would not necessarily be positively correlated with negative relations. The only difference between the correlation models required for the two kinds of tie is the promotion or demotion of an employee. All other factors, such as sameness of language, state or region, and age, can be modeled in the same manner for both negative and positive relations. The exponential random graph model (ERGM) framework is also suitable for negative ties, but the sparseness of the negative data matrix causes problems in the calculation of correlation and makes ERGM software packages less relevant [31]. Therefore, different models or configurations should be developed for better analysis of negative tie networks.
Figure 4 Network of school branches, in which nodes represent the designation at different levels.
2.3. Analyzing ties through centrality measures
The word 'centrality' refers to the central element, item or actor in any domain of knowledge. Researchers in different domains use the term centrality to refer to finding the important elements which reflect significant properties of the domain. Centrality was first applied to human communication by Bavelas [32]. Research experiments on centrality were conducted at the Group Network Laboratory, M.I.T., in the 1940s by Leavitt [33] and Smith [34]. The results showed that the efficiency of a group in problem solving is affected by centrality [35]. Besides its application to group problem solving, it was also used by Cohn and Marriott [36] to explain the integration of the large, heterogeneous and diverse social culture of a country such as India and the political administration of its social cultures. After this, the concept of centrality was analyzed by Pitts [37] for its use in urban development, by reconstructing the river transportation network in central Russia. Centrality can also be used in the design of large organizations, by combining the central units of various small organizations to form one large unit, as examined by Beauchamp [38] and Mackenzie [39]. Further, Rogers [40] suggested that centrality for an organization could be computed from the characteristics of the firm and the properties of its communication network. Besides these application areas, the concept of centrality plays an important role in social networks. It is used to find the most central and influential node in a network, through which most of the traffic flows and which controls what type of information diffuses into the network. The measures used to find the centrality of a node depend upon the structural properties of the network, and they make use of flows to examine these characteristics [41].
The terms used in centrality measures are briefly described below with reference to Fig. 5, which depicts a graph consisting of labeled points and edges. This graph can be visualized as a social network in which persons are connected to each other through the edges of the graph.
Figure 5 Graph of a network with vertices depicting nodes and edges referring to communication links between pairs of nodes.
i. Node: a point in a graph, corresponding to a person or actor in the network, such as 1, 2, 3, etc.
ii. Degree: number of direct connections from a given point to other points in the network. For example, degree of point (3) = 3 and point (1) = 4.
iii. Link: an edge in a graph, corresponding to a communication link that connects a pair of persons in the network.
iv. Path: sequence of edges between a pair of points. For example, a path between 5 and 2 is 5 → 4 → 3 → 2, and these three edges constitute the path.
v. Cycle: a path starting from and ending at the same point, like 1 → 2 → 3 → 1.
vi. Connected graph: a graph in which each point is reachable from every other point.
vii. Geodesic: a shortest path between a pair of points. For example, four paths exist between 1 and 4: 1 → 6 → 5 → 4; 1 → 2 → 3 → 4; 1 → 3 → 4; 1 → 4. Of these, the last one, 1 → 4, is the geodesic.
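The terms above can be made concrete with a short sketch. The Python fragment below is only an illustration: the five-node network and the helper names `degree`, `geodesic` and `is_connected` are assumptions introduced here, and the network is a hypothetical one, not the graph of Fig. 5. It stores the graph as an adjacency list and finds a geodesic by breadth-first search.

```python
from collections import deque

# Hypothetical undirected network (not the graph of Fig. 5), as an adjacency list.
graph = {
    1: {2, 3, 4},
    2: {1, 3},
    3: {1, 2, 4},
    4: {1, 3, 5},
    5: {4},
}


def degree(g, node):
    """Number of links incident on `node`."""
    return len(g[node])


def geodesic(g, source, target):
    """Return one shortest path from source to target (BFS), or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for neighbour in sorted(g[current]):
            if neighbour not in parent:
                parent[neighbour] = current
                queue.append(neighbour)
    return None


def is_connected(g):
    """True when every node is reachable from an arbitrary start node."""
    start = next(iter(g))
    return all(geodesic(g, start, node) is not None for node in g)
```

In this hypothetical network, `geodesic(graph, 2, 5)` returns a path of three edges, and a graph made of isolated points returns `None` for any pair, which is the disconnected case discussed later for negative ties.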
There are broadly two types of centralities categorized by Freeman [41]:
a. Point centrality: Point centrality can be described by the central position of a star or hub, which is considered the most central position possible in any network, like the position of node 1 in Fig. 6. According to Freeman, "A person located in the center of a star is universally assumed to be structurally more central than any other person in any other position in any other network of similar size" [41]. It follows that the central point of a star possesses three properties that make it most central [41]:
i. It should have maximum degree in the network.
ii. It should lie on the maximum possible number of geodesics between different pairs of points in the network.
iii. It should be maximally close to all other points by being located at minimum distance from them.
Figure 6 Star network of nodes. Node 1 has the highest value of centrality possible in any network of nine nodes.
b. Graph centrality: A graph is basically a set of points depicting a social network. The centrality concept is extended to graphs and is associated with the compactness of the graph. Freeman defined compactness as "A graph should be compact to a degree that the distances between pair of points are short and this graph theoretic conception is extended to social networks and renamed as graph centrality" [41]. Graph centrality measures are based upon the difference between the centrality of the most central point and that of the other points in the network or graph. Freeman [41] suggested that measures of graph centrality must have the following features:
i. They must include the value by which centrality of most central node exceeds the centrality of other nodes.
ii. They must express the ratio of computed excess to the maximum possible value of differences in centrality of nodes.
An expression for a measure of graph centrality must satisfy the following condition:
$$C_X = \frac{\sum_{i=1}^{n}\left[C_X(p^*) - C_X(p_i)\right]}{\max \sum_{i=1}^{n}\left[C_X(p^*) - C_X(p_i)\right]} \qquad (1)$$
where $C_X(p_i)$ is the centrality of node i and $C_X(p^*)$ is the largest value of centrality of any node in the network. The denominator gives the maximum possible value of the sum of differences between centralities for a graph of n nodes. Both point and graph centralities can be computed by using three measures: Degree, Betweenness and Closeness.
2.3.1. Degree centrality
Point centrality was first defined in terms of the degree of a point as the "count of number of points to which a given point is connected or adjacent to". Referring to Fig. 5, the degree of node 3 is three, the degree of node 5 is two, and so on. This conception of degree was introduced by many researchers, as listed in Table 1. They all defined degree as point centrality in terms of human social networks: the "person who permits direct connection to most other persons in social networks should be seen as focal point of communication where information of network flow through" [37,40,42-49]. But this definition was later found inadequate, as it did not consider graph size or the longest geodesic in the graph, which are needed to compare point centrality scores across graphs of different sizes.

Table 1 Comparisons of different measures of centrality.

| Type of centrality | Proposed by | Year | Field |
| --- | --- | --- | --- |
| Degree centrality | M.E. Shaw | 1954 | Analyzing group structure |
| | C. Faucheux and S. Moscovici | 1960 | Social influence |
| | W.L. Garrison | 1960 | Interstate highway |
| | K.D. Mackenzie | 1966 | Communication networks |
| | F.R. Pitts | 1965 | Communication paths in urban development |
| | D.L. Rogers | 1974 | Inter-organizational relations |
| | J.A. Czepiel | 1974 | Diffusion of technological innovation |
| | J. Nieminen | 1973 | Centrality in graph |
| | Y. Kajitani and T. Maruyama | 1976 | Assessment of communication networks |
| | L.C. Freeman | 1979 | Centrality in social networks |
| Betweenness centrality | A. Bavelas | 1948 | Group structure |
| | M.E. Shaw | 1954 | Behavior of individual in small group |
| | B.S. Cohn and M. Marriott | 1958 | Integration of Indian Civilization |
| | L.C. Freeman | 1979 | Centrality in social networks |
| Closeness centrality | A. Bavelas | 1950 | Communication patterns in groups |
| | M.A. Beauchamp | 1965 | Efficiency of organization |
| | G. Sabidussi | 1966 | Centrality index in graph |
| | R.L. Moxley and N.F. Moxley | 1974 | Uncontrived social network |
| | D.L. Rogers | 1974 | Inter-organizational relations |
| | L.C. Freeman | 1979 | Centrality in social networks |
| Eigenvector centrality | P. Bonacich | 1972 | Social networks |

| Type of centrality | Description of measure | Type of flow | Type of walk structure |
| --- | --- | --- | --- |
| Degree centrality | Defines centrality as the number of direct links of length 1 that a node has with its neighboring nodes. Concerned with communication activity in the network | Parallel duplication, e.g. e-mail broadcast, attitude influencing, money exchange | Path, trail, walk |
| Betweenness centrality | Defines centrality as the number of shortest paths between pairs of nodes on which a given node lies. Concerned with control of communication | Transfer, e.g. packet delivery system | Geodesic |
| Closeness centrality | Defines centrality of a node as the inverse of the sum of shortest path lengths from all other nodes in the network. Concerned with independence and efficiency in spreading a message | Transfer, e.g. packet delivery; parallel duplication, e.g. gossip process | Geodesic; path, trail, walk |
| Eigenvector centrality | Defines centrality of a node as proportional to the sum of the centralities of its neighboring nodes. Concerned with influencing the system | Parallel duplication, e.g. attitude influencing | Unrestricted walks |

Four measures of centrality are compared on the basis of their usage in performing specific activities, the type of flow they follow and the type of walk structure permitted by each measure for things to flow through the network.

Nieminen [48] provided a definite measure of degree centrality in terms of the number of adjacencies as
$$C_D(p_k) = \sum_{i=1}^{n} a(p_i, p_k) \qquad (2)$$
where $a(p_i, p_k) = 1$ if $p_i$ and $p_k$ are connected, else 0. This measure is partly a function of the size of the network, but the absolute count of the activity of a node may not be desirable for most applications. So, in order to remove the effect of network size, it was normalized by Freeman [41] to get the relative centrality, i.e.
$$C'_D(p_k) = \frac{\sum_{i=1}^{n} a(p_i, p_k)}{n - 1} \qquad (3)$$
Degree centrality indicates the potential communication activity of a node in the network. Graph centrality in terms of degree was given by Freeman [41] as
$$C_D = \frac{\sum_{i=1}^{n}\left[C_D(p^*) - C_D(p_i)\right]}{n^2 - 3n + 2} \qquad (4)$$
The denominator is the maximum possible value of the sum of differences, (n - 2)(n - 1) = $n^2 - 3n + 2$, where n is the number of nodes in the network. Degree centrality measures the immediate risk of a node being infected, within one time period, by a directly connected infected node in a network of infected nodes [50]. As shown in Table 1, the degree measure is used in parallel duplication flows and is thus appropriate for walk based processes [51]. In a positive tie network, degree measures are used to calculate the popularity of a node, i.e. the node that receives the largest number of positive ties from other nodes in the network (high in-degree) and sends many ties to other nodes (high out-degree) is the most liked person in the network. A similar interpretation can be made in a negative tie network of disliking relationships to identify the most disliked individual, who receives the largest number of negative ties from the other individuals. Degree based graph centralization can be used in a negative tie network to find the decrease in group cohesion due to the presence of negative relations of disliking and distrust. Degree centrality is the only one of the three measures that can be used in both positive and negative networks, because the degree of a node is independent of flows and is calculated from the number of direct connections. However, this measure cannot capture the centrality a node possesses due to nodes that lie more than one step away [31].
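The degree based measures described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `star` and `ring` generators and the `dislikes` list of negative nominations are all hypothetical. It computes the absolute degree, the relative degree obtained by dividing by n - 1, and the degree based graph centralization with denominator n^2 - 3n + 2, and it counts received negative ties as a simple negative in-degree.

```python
def degree_centrality(adj):
    """Absolute degree C_D(p_k): number of points adjacent to each point."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def relative_degree_centrality(adj):
    """Relative degree C'_D(p_k) = C_D(p_k) / (n - 1)."""
    n = len(adj)
    return {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}


def degree_centralization(adj):
    """Graph centrality: sum of [C_D(p*) - C_D(p_i)] divided by n^2 - 3n + 2."""
    n = len(adj)
    scores = degree_centrality(adj)
    largest = max(scores.values())
    return sum(largest - score for score in scores.values()) / (n * n - 3 * n + 2)


def star(n):
    """Star network: node 1 is the hub, nodes 2..n are leaves."""
    adj = {1: set(range(2, n + 1))}
    for leaf in range(2, n + 1):
        adj[leaf] = {1}
    return adj


def ring(n):
    """Ring network: every node is tied to its two neighbours on a circle."""
    return {i: {(i - 2) % n + 1, i % n + 1} for i in range(1, n + 1)}


def negative_in_degree(ties):
    """Count negative ties received; `ties` is a list of (sender, receiver)."""
    counts = {}
    for sender, receiver in ties:
        counts.setdefault(sender, 0)
        counts[receiver] = counts.get(receiver, 0) + 1
    return counts


# Hypothetical dislike nominations among four actors.
dislikes = [("A", "D"), ("B", "D"), ("C", "D"), ("D", "A")]
```

For a nine-node star the hub has relative degree 1 and the centralization is 1, whereas a ring, in which all degrees are equal, has centralization 0. In the hypothetical dislike data, actor D receives the most negative ties and would be read as the most disliked individual.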
2.3.2. Betweenness centrality
Betweenness as a measure of point centrality can be stated as the number of times a point lies on the shortest paths, or geodesics, linking pairs of points. This measure is based upon the assumption that information flows only along the shortest paths connecting pairs of points in a network [41]. A node falling on most of the geodesics in a network is considered central; it controls the transmission of information in the group and thus acts as its coordinator [36,52]. For example, in the star network of Fig. 6, center node 1 is the most central node, as it lies on all the geodesics connecting the pairs of other nodes in the network. Betweenness of a point can be determined easily, but when there exist multiple geodesics between a pair of points, partial betweenness is used in the form of probabilities [41]. For example, there are two geodesics between nodes 2 and 5 in Fig. 5: 5 → 4 → 3 → 2 and 5 → 6 → 1 → 2. Here, the probability of using either of the two paths is 1/2. Take a node, say 3. The probability that this point occurs on a randomly selected geodesic between 2 and 5 is 1/2. This can be written mathematically as
$$a_{ij}(p_k) = \frac{m_{ij}(p_k)}{m_{ij}} \qquad (5)$$
where $m_{ij}$ is the number of geodesics between points i and j and $m_{ij}(p_k)$ is the number of geodesics between i and j that contain point $p_k$. The betweenness centrality of point $p_k$ can now be written as [41]
$$C_B(p_k) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}(p_k) \qquad (6)$$
where i < j and n is the number of nodes in the network. $C_B(p_k) = 1$ when there is only one geodesic between a pair of nodes that contains node $p_k$. As the frequency of occurrence of $p_k$ on geodesics increases, the value of $C_B(p_k)$ increases. $C_B(p_k)$ is an absolute count, which is of little interest for most applications. Therefore, to obtain a relative count, it was normalized by Freeman as [41]
$$C'_B(p_k) = \frac{2\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}(p_k)}{n^2 - 3n + 2} \qquad (7)$$
where $(n^2 - 3n + 2)/2$ is the maximum possible value of the betweenness of node $p_k$ in a graph of n nodes. In the case of graph centrality, betweenness was defined by Freeman as the "average difference between centrality of the most central point $C_B(p^*)$ and all other points" [41]:
$$C_B = \frac{2\sum_{i=1}^{n}\left[C_B(p^*) - C_B(p_i)\right]}{(n^2 - 3n + 2)(n - 1)} \qquad (8)$$
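The point betweenness measure and its normalized form can be illustrated with the following sketch. It is an assumption-laden illustration rather than a reference implementation: the function names and the two example networks are hypothetical. Geodesics are counted by breadth-first search, and the partial betweenness of a point for a pair is taken as the fraction of that pair's geodesics that contain the point, as described above.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Geodesic lengths and numbers of geodesics from `source` to each node."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                count[v] += count[u]
    return dist, count


def betweenness(adj):
    """Absolute betweenness: sum over pairs i < j of the partial betweenness."""
    info = {s: bfs_counts(adj, s) for s in adj}
    cb = {k: 0.0 for k in adj}
    for i, j in combinations(sorted(adj), 2):
        dist_i, count_i = info[i]
        if j not in dist_i:
            continue  # unreachable pair contributes nothing
        for k in adj:
            if k in (i, j) or k not in dist_i:
                continue
            dist_k, count_k = info[k]
            # k lies on a geodesic from i to j when the distances add up.
            if j in dist_k and dist_i[k] + dist_k[j] == dist_i[j]:
                cb[k] += count_i[k] * count_k[j] / count_i[j]
    return cb


def relative_betweenness(adj):
    """Normalized betweenness: 2 * C_B(p_k) / (n^2 - 3n + 2)."""
    n = len(adj)
    return {k: 2 * v / (n * n - 3 * n + 2) for k, v in betweenness(adj).items()}


# Hypothetical examples: a five-node star and a four-node cycle.
star5 = {1: {2, 3, 4, 5}, 2: {1}, 3: {1}, 4: {1}, 5: {1}}
cycle4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
```

In the star, the hub lies on the single geodesic of every pair of leaves and obtains a relative betweenness of 1. In the four-node cycle, opposite nodes are joined by two geodesics, so each intermediate node receives a partial betweenness of 1/2.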
Betweenness is considered a measure of control of communication, with the central point controlling the stream of information passing through the network. The $C_B$ measures of betweenness are subject to two restrictions:
i. These measures were defined on simple binary graphs, which do not include the quantitative attributes generally used to evaluate relations in social networks, such as the strength of a relationship based on the number of interactions [53].
ii. These measures consider only geodesic paths for information flow, while in practical situations a message is usually passed along a random path or a path intentionally selected by the end nodes [54].
For the reasons stated above, Freeman et al. [53] suggested a new measure of betweenness based on the Ford and Fulkerson model on a valued graph, in which a value is assigned to each edge in the graph of the network. They related this value to the strength of the social linkage, i.e. the amount of interaction, specified by the time the actors spend with each other or by any other social setting. These attributes are represented by the values or labels of the edges. The capacity of a channel, depicted by the proximity function C, was stated as the total amount of information that can be passed through a communication link. $f_{ij}$ is the amount of information passing through the channel linking i and j, and $c_{ij}$ is the capacity of that channel. The relation between them, defined by Freeman et al., is as follows [53]:
$$f_{ij} \le c_{ij} \qquad (9)$$
This equation implies that the flow cannot exceed the capacities of the channels linking the points through direct or indirect paths.
In Fig. 7, the path from $X_i$ to $X_j$ is a direct path and the paths via $X_k$ and $X_m$ are indirect paths to reach $X_j$ from $X_i$. Freeman et al. used the Ford and Fulkerson model [55-57], in which the originator of information is called the source $X_i$ and the receiver of information is called the sink $X_j$. This model poses two restrictions on the flow between $X_i$ and $X_j$:
i. Flow coming out of $X_i$ should be equal to flow going into $X_j$.
ii. Flow coming into each intermediary node $X_k$ or $X_m$ connecting $X_i$ to $X_j$ should be equal to the flow going out of it.
On the basis of these restrictions, i-j cut sets were defined. Each cut set $E_{ij}$ contains an edge from each and every direct and indirect path between $X_i$ and $X_j$, i.e. if the edges of $E_{ij}$ are removed from the graph, then $X_j$ becomes unreachable from $X_i$. The capacity of an i-j cut set is the total sum of the capacities of the edges in the set [53]. The lowest capacity among the i-j cut sets is termed the minimum cut capacity, and the maximum flow from $X_i$ to $X_j$ cannot exceed this minimum cut capacity [57]. In the betweenness model for valued graphs, $X_k$ stands between a pair of points $X_i$ and $X_j$, as seen in Fig. 7, to the extent that the maximum flow between them passes through $X_k$; this amount is denoted by $m_{ij}(X_k)$. $C_F(X_k)$ is the sum, over all pairs of points in the graph, of the maximum flow that passes through $X_k$, as defined by Freeman et al. [53]:
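$$C_F(X_k) = \sum_{i=1}^{n}\sum_{j=1}^{n} m_{ij}(X_k), \quad \text{where } i < j \qquad (10)$$
To normalize this value, it is divided by the total flow between all the pairs in the network excluding $X_k$. The above equation then becomes
$$C'_F(X_k) = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} m_{ij}(X_k)}{\sum_{i=1}^{n}\sum_{j=1}^{n} m_{ij}} \qquad (11)$$
where $m_{ij}$ is the maximum flow between $X_i$ and $X_j$, with i < j and i, j ≠ k. Through this measure, the centralization of the graph determined by [53] is given as
$$C_F = \frac{\sum_{k=1}^{n}\left[C'_F(X^*) - C'_F(X_k)\right]}{n - 1} \qquad (12)$$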
where $C_F$ is the average difference between the centrality of the most central point and the centralities of the other points. The measures of betweenness focus on the frequency of arrival of packets. The betweenness measure of centrality is used in different types of network flows, for which it makes two assumptions. First, it considers traffic as indivisible: it can be transferred from one node to another but cannot be present at two places at one time through duplication. Second, traffic can take only shortest paths, and flows have a predefined origin and target. As shown in Table 1, betweenness is used in packet delivery processes, in which a packet is indivisible, is present at one place at any time and moves through transfer along the shortest possible path predetermined by the process [51]. Betweenness measures can be used in positive ties where a high level of transitivity and flow of information exist. But, as discussed, flow of resources is absent in negative tie networks; therefore, it becomes very difficult to calculate betweenness measures in such networks [31].
Figure 7 $X_i$ is the source, $X_j$ is the sink, and information between them may flow through the direct path or through $X_k$ or $X_m$, referred to as indirect paths.
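The flow model behind the valued-graph betweenness measure can be illustrated with a small maximum-flow sketch. Everything below is an assumption made for illustration: the capacities assigned to the four points are hypothetical, each undirected valued tie is treated as a pair of opposite arcs of equal capacity, augmenting paths are found by breadth-first search (the Edmonds-Karp variant of the Ford and Fulkerson method), and the flow passing through an intermediary is approximated as the drop in maximum flow when that point is removed, which is one simple way to obtain such a quantity and is not claimed to be the procedure of Freeman et al.

```python
from collections import deque
from itertools import combinations

# Hypothetical valued graph in the spirit of Fig. 7: tie strengths as capacities.
ties = {
    "Xi": {"Xj": 2, "Xk": 3, "Xm": 2},
    "Xj": {"Xi": 2, "Xk": 1, "Xm": 4},
    "Xk": {"Xi": 3, "Xj": 1},
    "Xm": {"Xi": 2, "Xj": 4},
}


def max_flow(capacity, source, sink):
    """Maximum flow from source to sink via shortest augmenting paths."""
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    total = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path is left
        # Bottleneck capacity along the path that was found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck along the path and update residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] = residual[v].get(u, 0) + bottleneck
            v = u
        total += bottleneck


def min_cut_capacity(capacity, source, sink):
    """Brute-force minimum capacity over all cuts separating source and sink."""
    others = [node for node in capacity if node not in (source, sink)]
    best = float("inf")
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            side = {source, *extra}
            cut = sum(c for u in side for v, c in capacity[u].items() if v not in side)
            best = min(best, cut)
    return best


def without(capacity, node):
    """Copy of the valued graph with `node` and its ties removed."""
    return {u: {v: c for v, c in nbrs.items() if v != node}
            for u, nbrs in capacity.items() if u != node}


def flow_through(capacity, source, sink, node):
    """Assumed proxy: loss of maximum flow when `node` is removed."""
    return max_flow(capacity, source, sink) - max_flow(without(capacity, node), source, sink)
```

With these hypothetical capacities, the maximum flow from Xi to Xj is 5, which coincides with the minimum cut capacity found by enumeration, and removing Xk lowers the maximum flow by 1.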
2.3.3. Closeness centrality
In order to find the central point or node in a network, another view can also be considered, in which the point that has minimum distance from, i.e. is closest to, all other points in the network is called the central node. This node does not depend upon other nodes to deliver its message or information. Take the example of the star network of Fig. 6. In this network, node 1 is able to deliver its message to all other nodes by itself, whereas node 2 depends upon node 1 to relay its message, and the same holds for all other nodes [35]. This property of independence of the central point is known as the 'closeness' of a point. Closeness based measures were introduced by many researchers [35,38,40,58,59], as listed in Table 1, but the most predominant and revealing measure was given by Sabidussi [58]. He defined the centrality of a node as the inverse of the sum of geodesic distances from a point to all other points in the network. The measure given by Sabidussi can be written as [58]
$$C_C(p_k)^{-1} = \sum_{i=1}^{n} d(p_i, p_k) \qquad (13)$$
where $d(p_i, p_k)$ is the number of edges in the geodesic from $p_i$ to $p_k$. $C_C(p_k)^{-1}$ increases when the distance between $p_k$ and the other nodes in the network increases, i.e. the closeness of $p_k$ decreases with increasing distance and vice versa. If the graph becomes disconnected then $d(p_i, p_k)$ becomes infinite and closeness approaches zero. $C_C(p_k)^{-1}$ depends on graph size and gives an absolute count. To normalize it, Freeman divided it by n - 1, which gives the relative closeness [41]:
$$C'_C(p_k) = \frac{n - 1}{\sum_{i=1}^{n} d(p_i, p_k)} \qquad (14)$$
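In terms of graph centrality, closeness was measured by Freeman from the relative closeness $C'_C$ of the points as [41]
$$C_C = \frac{\sum_{k=1}^{n}\left[C'_C(p^*) - C'_C(p_k)\right]}{(n^2 - 3n + 2)/(2n - 3)} \qquad (15)$$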
In the case of closeness centrality, the maximum can be achieved in a star or wheel network and the minimum in a complete graph, in which all the points are homogeneous with no difference in their centrality. Closeness can also be interpreted as the expected arrival time of something flowing through the network, but only if it flows along shortest paths [51]. If the raw closeness score of a node (its summed distance to the other nodes) is low, then the node is at a short distance from the other nodes and receives the flow early. Closeness as an index of centrality is used only in processes that follow shortest paths in the network, such as the packet delivery process, or that follow parallel duplication, where things flow in parallel through all possible paths, as evident from Table 1. In the latter case, closeness does not indicate the reception speed, as all possible paths are traversed, including the shortest paths, as in a gossip process, in which it cannot be determined who receives the information and in what order [51]. Also, closeness can be computed only in a connected graph with a flow-based network [41]. Therefore, this measure can easily be applied to positive ties where transitivity of flow is maximum, as in friends of friends being friends. But in negative tie networks, which are usually represented by disconnected graphs, closeness becomes zero, as nodes are not reachable from one another. Moreover, things do not flow along paths longer than one step in negative ties, so closeness is difficult to compute for node centrality [31].
The three centralities are three different views on the structural properties of a graph, as inferred by Freeman [41].
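The behavior of closeness on connected and disconnected graphs can be checked with the following sketch. It is an illustration under assumptions: the function names and both example networks are hypothetical, relative closeness is computed as n - 1 divided by the summed geodesic distance, and a node that cannot reach every other node is assigned a closeness of zero, following the statement that the distance becomes infinite in a disconnected graph.

```python
from collections import deque


def distances_from(adj, source):
    """Geodesic distance (number of edges) from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def relative_closeness(adj):
    """Relative closeness (n - 1) / sum of distances; 0.0 if any node is unreachable."""
    n = len(adj)
    result = {}
    for k in adj:
        dist = distances_from(adj, k)
        if len(dist) < n:
            result[k] = 0.0  # summed distance is infinite
        else:
            result[k] = (n - 1) / sum(dist.values())
    return result


# Hypothetical connected positive-tie network: a five-node star.
star5 = {1: {2, 3, 4, 5}, 2: {1}, 3: {1}, 4: {1}, 5: {1}}

# Hypothetical sparse negative-tie network: two dislike dyads and an isolate.
negative = {"A": {"B"}, "B": {"A"}, "C": set(), "D": {"E"}, "E": {"D"}}
```

In the star the hub reaches every node in one step and has relative closeness 1, while each leaf has 4/7. In the disconnected negative-tie example every node scores zero, which is why closeness is of little use there.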
2.3.4. Eigenvector centrality
This centrality measure was proposed by Bonacich [60]. According to him, the centrality or popularity score of a node is proportional to the sum of the centralities of the nodes to which it is connected. It corresponds to an eigenvector of the adjacency matrix W of the network:
$$(W - \lambda I)S = 0 \qquad (16)$$
where λ is the eigenvalue and S is the eigenvector. To obtain appropriate scores, λ should be the largest eigenvalue of the matrix. This measure was defined only for symmetric matrices and for relations such as friendship that are naturally symmetric. The entries of W should lie between 0 and 1: if two persons are friends then $W_{ij} = 1$, else 0. To explore different measures of centrality, Bonacich gave three approaches that explain the significance of using the eigenvector as a centrality measure [60,61].
i. Factor analysis approach:
It suggests that status scores of each actor in network can be calculated by column vector S called the first principal component factor of W, which is also the eigenvector corresponding to the largest eigenvalue of matrix W. The largest eigenvalue is standardized because it represents the length of eigenvector. By factor analysis, clique structure in network can be identified.
Clique is the group in the network in which every member is related directly or indirectly to other members of same group and the relations are not extended outside the group. The factor corresponding to each clique has zero or positive values for their elements. Zero value is for non-member of clique and positive value for member of clique. The individual having high value in factor is the one who is most popular in their cliques. The amount of variance among different factors for each clique depends upon the magnitude of eigenvalue corresponding to eigenvector of popularity scores i.e. larger the eigenvalue, better the eigenvector for analyzing relationships in the clique structure and vice-versa.
ii. Convergence of infinite sequence of status scores:
The simplest measure of centrality or popularity, as in the degree based measure, is the number of friends that a person has. It is called the first order measure. This process can be extended by weighting each person's choice of others as friends by the number of friends those others have; this is termed the second order measure. It can be improved further by weighting each person's choices by the second order measure, which gives the third order measure, and so on. Bonacich [60] expressed this sequence as
$$S_1 = W S_0 \quad \text{(first order measure, where } S_0 \text{ is a column vector of ones)} \qquad (17)$$
$$S_2 = W S_1 = W^2 S_0 \quad \text{(second order measure)} \qquad (18)$$
$$S_3 = W S_2 = W^2 S_1 = W^3 S_0 \quad \text{(third order measure)} \qquad (19)$$
and therefore,
$$S_m = W^m S_0 \quad \text{(mth order measure)} \qquad (20)$$
This sequence tends to infinity. To make it converge, a small modification was made: each measure was divided by the corresponding power of $\lambda_1$, the largest eigenvalue of matrix W:
$$S_1 = (W/\lambda_1) S_0 \qquad (21)$$
Hence,
$$S_m = (W^m/\lambda_1^m) S_0 \qquad (22)$$
He proved that $S_m$ converges to the eigenvector of $\lambda_1$. This result is the same as the output of the factor analysis approach. But $S_m$ converges to give the popularity scores of only the largest clique in the network. The smaller cliques and the isolates, who are non-members of the clique, are scored zero.
iii. Solution of linear equations
In order to find popularity of each person in the network, Bonacich weighted other person's contribution to it by their popularity scores. He generated system of linear equations for unknown scores as [60]
$$S_i = W_{i1}S_1 + W_{i2}S_2 + W_{i3}S_3 + \cdots \qquad (23)$$
This was generalized as S = WS in matrix form, which gives (W - I)S = 0. This system has a non-zero solution only if the determinant |W - I| = 0. Bonacich made a slight modification that did not affect the model but provided a solution to the equation:
$$\lambda S_i = W_{i1}S_1 + W_{i2}S_2 + W_{i3}S_3 + \cdots \qquad (24)$$
In matrix form, λS = WS, which can be written as (W - λI)S = 0. This is the eigenvector problem, where λ is the largest eigenvalue of W and S is the corresponding eigenvector.
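The convergence of the sequence of status scores to the eigenvector can be illustrated with a short power-iteration sketch. It is a sketch under assumptions, not Bonacich's procedure: the largest eigenvalue is not known in advance, so each step rescales the vector by its largest entry instead of dividing by the eigenvalue, and the four-node friendship matrix `w` is hypothetical (a triangle with one pendant node, chosen because plain power iteration can oscillate on bipartite networks such as a star).

```python
def mat_vec(w, s):
    """Product of matrix w (list of rows) with column vector s."""
    return [sum(w_ij * s_j for w_ij, s_j in zip(row, s)) for row in w]


def eigenvector_centrality(w, iterations=200):
    """Return (eigenvalue estimate, score vector) by repeated multiplication by w."""
    s = [1.0] * len(w)  # S_0: column vector of ones
    eigenvalue = 0.0
    for _ in range(iterations):
        nxt = mat_vec(w, s)
        eigenvalue = max(abs(x) for x in nxt)  # rescaling factor for this step
        if eigenvalue == 0:
            return 0.0, nxt  # no ties at all
        s = [x / eigenvalue for x in nxt]
    return eigenvalue, s


# Hypothetical symmetric friendship matrix: nodes 0, 1, 2 form a triangle,
# node 3 is tied to node 0 only.
w = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
```

After convergence the returned pair satisfies W S = λ S up to rounding. Node 0 obtains the highest score, and node 3, whose single tie is to the most central node, still scores lower than nodes 1 and 2.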
All three approaches, each identifying a different aspect of the weighted popularity score of actors in a network, give the same solution. On this basis, the eigenvector of the largest eigenvalue can be considered a measure of centrality or popularity that is the weighted sum of direct as well as indirect paths of any length to a node in the network. This measure is also known as an influence measure, in which a node influences all of its neighbors and they further influence other nodes. As shown in Table 1, eigenvector centrality allows traffic to move through unrestricted walks, as it counts the walks of all lengths originating from a node. It is also appropriate for measuring the long term direct and indirect risk of infection in a virus network [50] and is used in parallel duplication processes such as attitude influencing [51]. This measure resembles degree based measures; the only difference is that degree measures account for walks of length one, whereas in eigenvector centrality walks can be of unbounded length. Eigenvector centrality is usually used in networks where the degree of nodes varies widely, for example where a low degree node is connected to a high degree node. Because of this wide range of degrees, the eigenvector gives the relative score of each node with respect to its neighboring nodes. In networks where the degree of nodes does not vary, the eigenvector is as good as a vector whose elements represent each node's degree. Degree based measures work only on simple graphs with binary relations between vertices, whereas eigenvector centrality measures can be used on signed and valued graphs [62]. A variety of complex flows can be analyzed with eigenvector centrality. There are many variants of eigenvector based measures, which allow different indices to be analyzed. For example, a status measure can be calculated in positive and negative networks by forming cliques based on the values of the eigenvector of the adjacency matrix [63], and power can be assessed in negatively connected bargaining networks by using the concept of the eigenvector [64]. Therefore, this measure can suitably be applied to both negative and positive tie networks.
3. Influence measures
Besides the centrality measures discussed so far, the ties or relations in a social network can also be analyzed on the basis of degree of influence, i.e. how much node A is affected or influenced through its relations with node B in the network. This can be computed by considering all the connections between them in the network. Unlike the previous measures, which determine the centrality of nodes through geodesics (i.e. shortest paths) only, influence measures consider the total number of walks between a pair of actors. These measures provide the relative power or status of a person in a network by accounting for paths of all lengths between pairs of nodes. But paths of different lengths (such as a 1-length path and a 10-length path) cannot be weighted equally when computing influence, i.e. a weight depending on the length of the path should be assigned so that shorter paths get more weight and longer paths are weighted considerably less. On the basis of this weight, three different measures were developed: the Katz measure [65], the Hubbell measure [66] and the Bonacich power measure [64].
3.1. Katz measure
Before 1953, all status measures considered only the number of choices received by an actor as an index of status. Although this gives an idea of the high status nodes in a network, the correct ordering cannot be decided on the basis of the number of choices or links received by a node alone. Katz [65] then proposed a new status/influence measure that includes in an individual's score not only the direct links received by the individual but also the popularity or status of the individuals sending those links. Further, the status of each individual who in turn has a link with these individuals should also be used for calculating scores in the social network. If the social network data are represented by a matrix W, also known as the choice matrix, then each element $W_{ij}$ indicates whether actor i chooses actor j, with the condition that W should be a positive symmetric matrix [67]. The elements of W are the paths of length one between nodes, whereas the elements of the nth power of W are the numbers of n-length chains between pairs of actors. This means $W^2_{ij} = \sum_{m} W_{im}W_{mj}$, where each component $W_{im}W_{mj} = 1$ only if i chooses m and m chooses j, forming a two length path, so the element $W^2_{ij}$ contains the number of such two length paths [68]. The sum of all the elements of column j in W gives the number of direct choices of j by other individuals. Similarly, the column sums of $W^2$ give the two-step choices, those of $W^3$ give the three-step choices of individual j, and so on. Katz defined a measure of status by adding the direct, two-step, three-step, etc. choices of individual j, giving lesser weights to higher orders of choice so that longer chains contribute less to the status score. Let a be the attenuation factor, i.e. the weight assigned to a single link or edge. W contains 1-length paths or single links; therefore, its attenuation is a. Similarly, $W^2$ contains two consecutive links, so its attenuation is $a^2$, the weight of $W^3$ is $a^3$, and so on. The resulting matrix contains the weighted sum over paths of all lengths, as derived by Katz [65]:
$$T = aW + a^2W^2 + a^3W^3 + \cdots + a^kW^k + \cdots = (I - aW)^{-1} - I \qquad (25)$$
T is the resultant matrix. The attenuation factor a should be chosen such that $\lambda_1 < 1/a < 2\lambda_1$, where $\lambda_1$ is the absolute magnitude of the largest eigenvalue of matrix W; the value of a depends upon the group of individuals to which information is transmitted. Through the Katz measure, the most influential node or individual in a positive tie network can be found, namely the one who has connections with most of the other individuals and can influence or affect them with his decisions or activities. The major conclusion drawn by Katz was that the status of an individual in a network depends not only on the number of direct links but also on the status of those to whom the individual is directly linked. For example, it is possible that nodes B and D, having 5 and 4 direct links, are of lower status than node A, which has only two immediate links, to B and D. This happens because A is connected to two nodes that are high status nodes in the network; node A influences all other nodes in the network through these two nodes and thus obtains the highest status in the network, whereas B and D get comparatively lower scores. As shown in Table 2, the Katz measure can be applied in parallel duplication flow processes and generally follows walks of unbounded length [51]. This index was not interpreted for networks that have negative relations, as it depends upon flow in the network and its value depends upon how many well-connected nodes are immediately connected to a node. It is concluded by Everett and Borgatti [31] that "In negative ties the value of measure that flow through networks is analyzed rather than a flow of resources as opposed to positive ties network in which value of something flowing through network is analyzed through centrality measures". But Katz only tried to capture the value of something flowing through a network of symmetric relations such as friendships.
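A small numerical sketch may help in reading the Katz series. It is an illustration under assumptions: the helper names, the four-node symmetric choice matrix `w` and the attenuation factor a = 0.3 are all hypothetical, chosen so that 1/a lies between the largest eigenvalue of `w` (about 2.17) and twice that value. The series of attenuated powers is summed term by term, and the status of each individual is taken as the corresponding column sum.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def katz_matrix(w, a, tol=1e-12, max_terms=10_000):
    """Sum aW + a^2 W^2 + a^3 W^3 + ... until the terms become negligible."""
    aw = [[a * x for x in row] for row in w]
    term = [row[:] for row in aw]   # current term a^k W^k, starting at k = 1
    total = [row[:] for row in aw]
    for _ in range(max_terms):
        term = mat_mul(term, aw)    # next term a^(k+1) W^(k+1)
        total = [[t + x for t, x in zip(t_row, x_row)]
                 for t_row, x_row in zip(total, term)]
        if max(abs(x) for row in term for x in row) < tol:
            break
    return total


def katz_status(w, a):
    """Status of each individual j: column sum j of the Katz matrix."""
    t = katz_matrix(w, a)
    n = len(w)
    return [sum(t[i][j] for i in range(n)) for j in range(n)]


# Hypothetical symmetric choice matrix: nodes 0, 1, 2 choose each other,
# node 3 and node 0 choose each other.
w = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
```

Entry (1, 2) of `mat_mul(w, w)` counts the single two-step chain from node 1 to node 2 through node 0. The summed series can be checked against the closed form, since multiplying it plus the identity by (I - aW) should return the identity.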
3.2. Hubbell measure
Hubbell [66] interpreted the flow of influence through interpersonal links in social networks as input and output channels. He analyzed the association of dyads, or pairs of nodes, in a network for the identification of cliques as well as for measuring the contribution made to the status of an individual by the other persons present in the network, that is, by computing dyadic influence. The index proposed by Hubbell has structural as well as functional significance. The structural significance of the index lies in identifying cliques and the functional significance in the computation of status scores. He considered the structure matrix W (the same as the choice matrix W in the Katz measure), in which $W_{ij} = 1$ if j chooses i and $W_{ij}$ is the weight of the link joining i with j. The value of the weight can be positive, negative or zero. All the powers of the structure matrix are used in order to consider all chains of length one to n between pairs of nodes.

Table 2 Characteristics of three influence measures.

| Type of influence measure | Katz influence | Hubbell influence | Bonacich power |
| --- | --- | --- | --- |
| Proposed by | L. Katz | C.H. Hubbell | P. Bonacich |
| Year | 1953 | 1965 | 1987 |
| Expression of status scores | $t = (I - aC)^{-1} - I$ | $h = (I - \beta W)^{-1}E$ | $b = (I - \beta R)^{-1}R1$ |
| Attenuation factor | $1/a > \lambda_1$, where $\lambda_1$ is the characteristic root of C | β is the fractional weight assigned to entries of the matrix | $\lvert\beta\rvert < 1/\lambda_1$, where $\lambda_1$ is the largest eigenvalue of matrix R |
| Type of graph (symmetric/asymmetric) | Only symmetric graphs | Only symmetric graphs | Both symmetric and asymmetric graphs |
| Type of network ties supported | Interpreted only for positive ties networks | Interpreted for both positive and negative ties (by varying the sign of the value of β) | Both positive and negative ties networks |
| Matrix used in measure | Choice matrix C, $C_{ij} = 1$ depicting actor i choosing actor j | Structure matrix W, $W_{ij}$ depicting weights of links between nodes that can be positive, negative, zero or fractional | Relationship matrix R, $R_{ij}$ depicting values between 0 and 1 inclusive |

Relation between the three measures: t = h - 1 and t = βb, where t is the Katz measure, h is the Hubbell measure and b is the Bonacich measure.

Hubbell [66] performed the summation of all powers of the matrix and called it Y:
$$Y = I + W + W^2 + W^3 + \cdots \qquad (26)$$
This infinite series can be made to converge by having fractional weights in the structure matrix W. As there exist two values for each dyad, $y_{ij}$ and $y_{ji}$, the index of association of dyads is defined as $m_{ij} = m_{ji} = \min(y_{ij}, y_{ji})$ and is used for identifying cliques by discriminating inter-clique and intra-clique bonds. $m_{ij}$ and $m_{ji}$ carry the structural significance of the index. The functional significance is described by the status score vector S. Let $S_i$ be the status score of the ith member of the group structure, given as [66]
$$S_i = e_i + W_{i1}S_1 + W_{i2}S_2 + W_{i3}S_3 + \cdots \qquad (27)$$
where $e_i$ is the exogenous factor and $W_{ij}$ is the weight of the link joining i and j. In matrix form
$$S = E + WS \;\Rightarrow\; S = (I - W)^{-1}E \qquad (28)$$
W is the structure matrix. Here, the values of the column vector E are also known as the boundary conditions of the system. It becomes necessary to include E when open systems such as communication networks are considered. The status score column vector S is defined by Hubbell as [66]
$$S = (I + W + W^2 + W^3 + \cdots)E \qquad (29)$$
The interpretation of the Hubbell measure is slightly different from that of Katz: in the Hubbell measure $W_{ij}$ represents the weight of the influence of j on i, whereas in the Katz measure the corresponding entry represents the choice of j by i. Katz also ignored the boundary conditions of an open system, which Hubbell's measure captures through E. The attenuation factor in the Hubbell measure is given by the fractional weights of the structure matrix W. These weights make the infinite series of powers of W converge. Katz did not include the identity matrix in the series used to calculate the total number of walks between a pair of nodes. Both measures compute the flow of influence in a network. In Hubbell's structure matrix, negative link weights are allowed; therefore, it can be used to analyze both the positive and negative relations in sociometric data. The Hubbell measure can be used in parallel duplication flow processes such as attitudinal influencing [51].
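As a rough illustration of Eqs. (28) and (29), the following sketch sums the series $(I + W + W^2 + \cdots)E$ term by term for a small structure matrix. The matrix `W`, the vector `E` and the function name `hubbell_status` are hypothetical values chosen for this example only; the weights are kept small so that the series converges.

```python
def hubbell_status(W, E, terms=200):
    """Approximate S = (I + W + W^2 + ...) E by summing a finite number of terms."""
    n = len(W)
    term = list(E)          # current term W^k E, starting with k = 0
    S = list(E)             # running sum of the series
    for _ in range(terms):
        # next term: W applied to the previous term
        term = [sum(W[i][j] * term[j] for j in range(n)) for i in range(n)]
        S = [s + t for s, t in zip(S, term)]
    return S

# Hypothetical 3-actor structure matrix; row i holds the influence of each j on i.
# One weight is negative to show that signed links are allowed.
W = [[0.0, 0.5, 0.0],
     [0.2, 0.0, -0.3],
     [0.0, 0.4, 0.0]]
E = [1.0, 1.0, 1.0]         # exogenous (boundary) inputs

S = hubbell_status(W, E)
# The result should satisfy the fixed-point form S = E + W S of Eq. (28).
residual = [S[i] - (E[i] + sum(W[i][j] * S[j] for j in range(3))) for i in range(3)]
```

The residual check is one simple way to confirm that the truncated series agrees with the closed form; with larger weights the series may not converge and the closed form would have to be solved directly.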
3.3. Bonacich power measure
The centrality of node does not always correspond to power of node in social networks [69]. Specifically in negatively connected bargaining networks, the most central person is not always the most powerful one. This view contradicts the extensive social literature [70-72]. In order to distinguish between power and centrality, Bonacich [64] proposed a set of measures given by c(a, b). The parameter b is used to reflect the degree and direction (positive or negative) in which individual status depends upon status of other people in network. In positive communication network of information exchange, power of a person depends positively on high status persons he/she is connected to. The parameter b attains positive value in these networks. But in case of negatively connected networks of commodity exchange, power of a person decreases by having connection to high status people who have many potential alternatives. In such situation b has negative value. The magnitude of b reflects the degree to which status score c(a, b) of a person is a local or global measure i.e. to which extent it is a function of status score of immediate other persons in network. If b = 0, then c(a, b) is similar to number of direct communication links to all other persons in network. As b increases, centralities of distant persons or indirect communication links are considered in c(a, b) score. This measure can be applied to symmetric as well as asymmetric networks [73].
This measure is based on the eigenvector centrality e proposed by Bonacich, given as [60]
$$\lambda e_i = \sum_j A_{ij} e_j \tag{30}$$
where A is the adjacency matrix and $\lambda$ is the largest eigenvalue of matrix A. In the same spirit, beta centrality is defined as [64]
$$c_i(a, b) = \sum_j (a + b\,c_j) R_{ij} \tag{31}$$
Now, the centrality of neighbor nodes is weighted by the parameter b, whose magnitude and sign define the contribution of the other actors of the network to the power of actor i.
In matrix form, the measure is given as [64]
$$c(a, b) = a(I - bR)^{-1}R\mathbf{1} \tag{32}$$
$$c(a, b) = a(R\mathbf{1} + bR^{2}\mathbf{1} + b^{2}R^{3}\mathbf{1} + \cdots) \tag{33}$$
This is the series of path-length chains. Here, absolute value of b is less than reciprocal of largest eigenvalue of R. This expression indicates the contribution of direct links and all length paths (indirect links) of each individual in the beta centrality measure. Katz centrality closely resembles the beta centrality in estimation of status of an individual in network. Let t be the Katz centrality, given as
$$t = b\,c(a, b) \tag{34}$$
When b is positive both measures are perfectly correlated, but when b is negative the Katz and Bonacich measures become negatively correlated [74]. The parameter b can be positive, zero or negative. The measurement of power in these cases is as follows:
i. When b is positive or zero:
If b = 0, it means c(a, b) is simply a degree measure and status of person is defined by number of people directly connected to it. But when b is non-zero and positive, magnitude of b defines the degree to which score of actor is a function of scores of other actors in network. The positive value of beta is achieved in positive network, such as communication network, where information is exchanged between pair of actors. b is the probability that information is received by actor and all of its contacts. According to Eq. (33), R1 is the direct information exchange to connected ones, bR21 is the transmission of information to actors over 2-length paths and so on. The value of b decides up to which level of path length chains the information should be delivered by an actor to surrounding actors. It can be assumed that a circle is defined by radius b and value of b decides to exchange information either locally or to the structure as whole. In communication networks, the higher value of b would be appropriate to communicate over distant nodes. The interpretation of b in positive network is similar to Katz measure i.e. power of person in social network depends upon status of people to whom he/she is connected. Higher the status of surrounding people, more powerful the person in network.
ii. When b is negative:
b usually attains a negative value in negative networks, such as bargaining and exchange networks. Here the interpretation of power changes completely. Suppose that, in an exchange network, a commodity exchanged with one person cannot be exchanged with another. Then the bargaining power of a person decreases if he/she is connected to high status people who have many other potential alternatives.
Reconsidering Eq. (33), the function of b can be analyzed as follows:
If b is negative in above equation, then terms of even power of R decrease the score of person and terms of odd powers of R increase value of c(a, b). This can be interpreted as follows:
a. $aR\mathbf{1}$ indicates that having many connections increases power and centrality.
b. $abR^{2}\mathbf{1}$ implies that if the connections of a person have many other alternatives in an exchange network, then this decreases the power of the person but increases his/her centrality.
c. $ab^{2}R^{3}\mathbf{1}$ means that if there are many paths of length 3, or many connections of the nodes reached by 2-length paths, then this decreases the power of the immediate connections of the person and consequently increases his/her own power [64].
This can be explained by the following example:
In Fig. 8 of tree graph depicting exchange network, the power of A increases due to connections to B and C at length one. But, as there are many connections to B and C at 2-length which make them central, it results in decrease of power of A. Consequently, D, E, G, and H are also having many connections, which decreases power of B and C but increases the power of A. This concludes that power of A is increased due to more number of nodes at 1-length and 3-length paths and decreased due to the presence of many nodes at 2-length paths.
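The effect of the sign of b can be sketched numerically by summing the series of Eq. (33) term by term. The tree below is only an assumed stand-in for Fig. 8 (A tied to B and C, B tied to D and E, C tied to G and H); the exact figure is not reproduced here, and the function name `beta_centrality` and the values b = -0.3 and b = 0.3 are illustrative choices that keep the absolute value of b below the reciprocal of the largest eigenvalue of this tree, which is 2.

```python
def beta_centrality(adj, a=1.0, b=0.0, terms=200):
    """Sum a*(R1 + b*R^2 1 + b^2*R^3 1 + ...) for an undirected 0/1 network."""
    nodes = sorted(adj)
    term = {v: float(len(adj[v])) for v in nodes}   # first term R1 = degree
    score = dict(term)
    for _ in range(terms - 1):
        # next term = b * R * (previous term)
        term = {v: b * sum(term[u] for u in adj[v]) for v in nodes}
        for v in nodes:
            score[v] += term[v]
    return {v: a * score[v] for v in nodes}

# Assumed exchange tree, used only for illustration.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"), ("C", "G"), ("C", "H")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

degree_like = beta_centrality(adj, b=0.0)    # reduces to the degree measure
negative_b = beta_centrality(adj, b=-0.3)    # bargaining / exchange reading
positive_b = beta_centrality(adj, b=0.3)     # communication reading
```

In this assumed tree, a negative b leaves B and C, whose partners D, E, G and H have no alternatives, with much higher scores than A, whereas a positive b brings the score of A close to that of B and C.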
Thus, beta centrality differs from the Katz measure in the calculation of status. The c(a, b) measure is useful in valued and signed graphs and in both negative and positive tie networks. It is especially sensitive in situations where many low degree nodes are connected to high degree nodes, i.e. where differences in degree are present to drive centrality. In several situations beta centrality is similar to the degree measure: in regular graphs, where the degree of nodes does not vary; when the elements of the eigenvector corresponding to the largest eigenvalue sum to zero; and when a symmetric matrix has multiple eigenvalues of the same magnitude [62].
Figure 8 Exchange Network, where A is focal actor having connections with B and C with whom he exchanges commodity.
4. Projected approaches for analyzing ties
4.1. Graph complement approach
Many standard approaches and centrality measures are applicable to positive relations in a network but are not suitable for negative ties. In order to apply these techniques to negative networks, the graph complement approach can be used [31]. In this approach, the graph containing negative ties is complemented to form a positive tie network. In the complement graph, all edges that were present in the negative tie graph are absent and vice versa. This graph is then treated as a positive tie network, and all network analysis techniques can be applied to it. This is a very intuitive way of applying existing techniques to negative tie data. For example, consider Fig. 9, depicting the negative ties graph (G) and its complement (G'), which contains the edges that are absent in G.
Several problems can arise with this approach. First, the absence of a negative tie between two nodes does not necessarily mean that a positive relation exists between them, as the complement graph assumes. The graph complement approach works only if every relationship in the network is in exactly one of two states, such that a relationship that is not in state 1 must be in state 2. Second, a graph and its complement cannot both be disconnected, but if the graph of negative ties G is connected, G' is not necessarily connected. In situations where G' is disconnected, centrality measures such as closeness and betweenness cannot be used for analyzing ties. Third, apart from the structural differences between a graph and its complement, only degree centrality can be analyzed in G'. Suppose the degree of node 'a' in G is x; then the degree of the same node in G' becomes n - 1 - x, where n is the total number of nodes in G. Degree centrality in G' is thus derived from degree centrality in G. This is not true for betweenness and eigenvector centralities. The methods for which results in the complement graph can be deduced from the original graph are called complement consistent methods [31]. Among the approaches for finding equivalent actors, structural equivalence and automorphic equivalence are complement consistent, but regular equivalence is not preserved in the complement graph. In fact, no centrality measure other than degree centrality is complement consistent. The density of a negative tie network is usually low; thus, the complement graph G' has high density and its centrality values show little variance. All the methods discussed in previous sections that are interpretable for both positive and negative networks yield results both in G and G', while measures that are not applicable in negative networks due to their structural features can be analyzed through the complement graph [31]. As the complement approach also has many problems, there is a need for a new set of measures that can be applied specifically to negative tie data; these are discussed in the next section.
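The complement construction and the degree relation n - 1 - x can be checked with a few lines of code. The four-actor graph `G` below is a made-up example, and the representation as a dictionary of neighbor sets is simply a convenient assumption.

```python
def complement(adj):
    """Complement of an undirected graph given as {node: set of neighbours}."""
    nodes = set(adj)
    # In G' a node is tied to every other node it is NOT tied to in G.
    return {v: nodes - adj[v] - {v} for v in adj}

# Hypothetical negative tie graph G on four actors; actor d has no negative ties.
G = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set()}
G_prime = complement(G)

n = len(G)
# Degree centrality is complement consistent: deg_G'(v) = n - 1 - deg_G(v).
degrees_match = all(len(G_prime[v]) == n - 1 - len(G[v]) for v in G)
```

Note that the isolated actor d becomes the best connected actor in G', which illustrates the first problem mentioned above: an absent negative tie is read as a positive one.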
4.2. Negative cohesive subgroups
There is a vast literature on the concept of cohesive subgroups in networks, which are formed as a result of relationships between the nodes. Such a subgroup is popularly known as a 'clique'. Luce and Perry [75] defined a clique as "the subset of group consisting three or more members, each having symmetric relationship with each other and no member outside the subset exist that is in symmetric relation to every member of subset". Many researchers have defined cliques and developed measures for identifying them in networks. For example, Hubbell [66] in 1965 developed an input-output approach for clique identification, Bonacich [60] in 1972 developed the eigenvector centrality measure for finding cliques in symmetric graphs of social networks, and Everett and Borgatti [76] in 1998 analyzed overlapping cliques in a network. The structure of a clique can be recognized from various powers of a symmetric matrix, say A (derived from the adjacency matrix). The main diagonal entries of matrix $A^2$, $a_{ii}^{(2)}$, denote the number of chains of length two and also the number of elements in the graph with which element i is in symmetric relation. This set of elements along with i forms a clique. In order to identify whether a given element belongs to a clique, its main diagonal entry in $A^3$ is checked for a positive value. If this entry is positive then the element belongs to a clique, and if it is zero then the element is not a member of any clique [75]. The complete structure of a clique was not defined by them. Everett and Borgatti [76] tried to determine the overlap among cliques in a network by analyzing their structures. They made use of the UCINET tool for social network analysis [77]. The tool used a standard clustering method to identify groups in the network but failed to identify the structure and overlap of cliques. Therefore, they defined clique graphs and clique overlap graphs.
Figure 9 Graph G of the negative ties network and its complement graph G' depicting the positive ties network.

A clique graph or intersection graph has cliques as vertices, and an edge between two vertices represents that the cliques have some actors in common. Clique overlap graphs are iterative clique graphs that represent the amount of overlap between cliques in each iteration [76]. For example, there are six cliques, namely 1, 2, 3, 4, 5 and 6, as shown in Fig. 10, with the number of actors common to each pair depicted by the numbers on the edges. This graph is named the clique graph, K(G). The iterative overlap graphs are $K_2(G)$, $K_3(G)$ and so on. $K_2(G)$ contains edges between cliques which have two or more actors in common and, similarly, $K_3(G)$ contains edges only between cliques having three or more actors in common. The clique overlap graphs describe the amount of overlap among cliques and also help in analyzing the structures of cliques. By using these graphs, Everett and Borgatti [76] defined the co-clique matrix C, with entry $c_{ij}$ as the number of actors that clique i has in common with clique j. The results of the implementation showed that the structures and overlap of cliques were clearly identified. From the actor-by-clique matrix, the number of cliques in which each actor is present can be obtained. This is known as clique overlap centrality. When this measure was compared with other centrality measures, results showed that it correlates with the others when the network has a single core or periphery region. When multiple such regions are present, this measure gives different values as output [76]. The above discussion shows that the clique is a very important concept in analyzing positive ties in a network. But in the case of negative ties, a clique represents a group in which each member is in a negative relation with every other member, and the group formed is not cohesive. Everett and Borgatti [31] defined a negative clique as one in which every member outside the group has negative relations with members of the group. This group is the smallest group of individuals in the network such that everyone outside the group dislikes at least one individual inside the group. Negative cliques also overlap with each other, and this overlap can be analyzed with the same methods that are used for ordinary cliques.
The negative clique can be generalized by specifying its maximum size (just as a minimum size is specified for a general clique) and by defining a negative k-plex clique. The negative clique overlap centrality can be computed for each actor by counting the number of negative cliques in which it is present. A negative clique usually represents a highly undesired, disruptive group of individuals that is disliked by everyone outside the group. Negative cliques can be found by the key player program [78], in which the key players who can reach everyone in the network form a group. The negative clique concept is a future area of research for analyzing negative ties. It can be used to identify the group of pupils in a classroom who are responsible for bullying other students. Such a group is disliked by everyone in the class and forms a negative clique.
5. New measures of negative ties
The limitations of the graph complement approach call for new measures that deal specifically with negative relations and help in identifying such ties between pairs of actors in the network. Therefore, existing measures of positive relations such as degree centrality and eigenvector centrality are tailored so that they satisfy the conditions of a negative ties network. These new measures are given by Everett and Borgatti [31] and are analyzed by applying them to directed networks.
5.1. Negative degree centrality
Degree centrality of a node in negative networks is defined as the degree of the node in the complement graph, n - 1 - x, where x is the degree of the node in graph G [31]. To normalize, this degree is divided by n - 1, i.e.
$$d^*(x) = 1 - \frac{x}{n - 1} \tag{35}$$
This measure gives a low score to actors who are disliked by most of the others and a high score to actors who are disliked by fewer people. In directed graphs, $d^*_{in}$ and $d^*_{out}$ are computed from the in- and out-degrees of nodes. A high $d^*_{in}$ score means that an actor receives few negative ties, and a high $d^*_{out}$ score means that an actor sends few negative ties to other actors in the network. A person with low $d^*_{in}$ and $d^*_{out}$ scores is the most disliked person and, similarly, an actor is most popular in the network if he/she has high $d^*_{in}$ and $d^*_{out}$ scores [31]. Negative degree weights every tie equally and does not consider the position of the actor at the other end of the tie, i.e. it does not distinguish an actor tied to a highly central (most popular) actor from an actor tied to a disliked person in the network. This measure is related to a structural property of the node which reflects the number of nodes it is connected to, but not which nodes it is connected to. It fails to take into account the centralities of neighbor nodes and their effects on the centrality of the given node [31].
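A minimal sketch of Eq. (35) for a directed negative tie network follows. The edge list, the node names and the function name `negative_degree` are hypothetical; an edge (u, v) is read here as "u dislikes v".

```python
def negative_degree(nodes, neg_edges):
    """Return (d_in, d_out), the normalized negative degree scores of Eq. (35)."""
    n = len(nodes)
    in_deg = {v: 0 for v in nodes}
    out_deg = {v: 0 for v in nodes}
    for src, dst in neg_edges:          # src sends a negative tie to dst
        out_deg[src] += 1
        in_deg[dst] += 1
    d_in = {v: 1 - in_deg[v] / (n - 1) for v in nodes}
    d_out = {v: 1 - out_deg[v] / (n - 1) for v in nodes}
    return d_in, d_out

# Hypothetical example: everyone dislikes d, and d dislikes a.
nodes = ["a", "b", "c", "d"]
neg_edges = [("a", "d"), ("b", "d"), ("c", "d"), ("d", "a")]
d_in, d_out = negative_degree(nodes, neg_edges)
```

Here d receives the lowest possible in-score, while b and c, who receive no negative ties, receive the highest. The score of a ignores the fact that its only incoming negative tie comes from the most disliked actor, which is the limitation noted above.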
Figure 10 Clique graph K(G), first iterative overlap graph K2(G) and second iterative overlap graph K3(G). Nodes of the graphs depict cliques, and the numbers of overlapping actors are written as labels of the edges.
5.2. h* centrality
As the negative degree measure is limited in evaluating the centrality of a node, there is a need for a measure which reflects that having negative ties with a less central (most disliked) person is better than having negative ties with the most central (popular) actor in the network. This way of calculating centrality is similar to eigenvector centrality, discussed in Section 2.3.4. Everett and Borgatti [31] have recently proposed a new measure known as h*. The centrality of node x is denoted by h*(x) [31]:
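$$h^*(x) = 1 - \sum_{y \in N(x)} \frac{h^*(y)}{n - 1} \tag{36}$$
where N(x) represents the neighborhood of node x. In matrix form, the equation is given as
$$h^* = \left(I + \frac{1}{n - 1}A\right)^{-1}\mathbf{1} \tag{37}$$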
where I is the identity matrix, $\mathbf{1}$ is a column vector of 1's and A is the adjacency matrix. This measure is similar to the Katz [65], Hubbell [66] and Bonacich [64] measures, and h*(y) plays a role similar to the eigenvector centrality of node y. The only difference is that the sum in Eq. (36) is normalized by dividing it by n - 1 and, to bring it into the 0 to 1 range, is subtracted from 1. The h* of Eq. (37) is similar to the Hubbell measure, $h = (I - bA)^{-1}\mathbf{1}$, with $b = -1/(n - 1)$. Katz and Hubbell, however, did not consider a negative beta. The concept of negative beta was proposed by Bonacich [64] for negatively connected bargaining networks, where it is more beneficial to connect with those who have fewer options or alternatives, as this increases the centrality of the node. The same reasoning is used in the h* measure, which uses a negative beta for negative tie networks. This measure has a time complexity of $O(n^3)$. In the case of directed networks, $h^*_{in}$ and $h^*_{out}$ are also calculated for each node. An actor with a low $h^*_{in}$ score receives many negative ties from other actors, especially from those who send very few negative ties to other actors and therefore have a high $h^*_{out}$ score. This means that receiving negative ties from popular actors of the network greatly affects the centrality of the actor. In similar fashion, an actor with a low $h^*_{out}$ score is one who sends negative ties to most of the actors, especially to those who have very few incoming negative ties and therefore have a high $h^*_{in}$ score [31]. It means this person dislikes those whom everyone else likes. The directed formulae of the h* measures are given as follows [31]:
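$$h^*_{out}(x) = 1 - \sum_{y \in N_o(x)} \frac{h^*_{in}(y)}{n - 1}$$
$$h^*_{in}(x) = 1 - \sum_{y \in N_i(x)} \frac{h^*_{out}(y)}{n - 1}$$
$N_o$ and $N_i$ are the out- and in-neighborhoods of node x. The equations in matrix form are as follows:
$$h^*_{out} = \left(I - \frac{AA^{T}}{(n - 1)^2}\right)^{-1}\left(I - \frac{A}{n - 1}\right)\mathbf{1}$$
$$h^*_{in} = \left(I - \frac{A^{T}A}{(n - 1)^2}\right)^{-1}\left(I - \frac{A^{T}}{n - 1}\right)\mathbf{1}$$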
where $A^{T}$ is the transpose of matrix A. The h* measure gives more accurate and precise results than the negative degree centrality d*. The h* measure tends to give each actor in the network a distinct centrality value, whereas d* gives multiple actors the same score. Therefore, h* is a refinement of d* [31]. Both the h* and d* measures can be extended to accommodate valued data only when the largest eigenvalue of the adjacency matrix is less than n - 1.
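To see how h* refines d*, the undirected matrix form of Eq. (37) can be solved directly for a small network. The sketch below uses a plain Gauss-Jordan elimination so that it needs no external library; the five-actor network and the helper names `solve` and `h_star` are hypothetical.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def h_star(nodes, neg_edges):
    """h* = (I + A/(n-1))^{-1} 1 for an undirected negative tie network."""
    n = len(nodes)
    idx = {v: i for i, v in enumerate(nodes)}
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in neg_edges:
        M[idx[u]][idx[v]] += 1.0 / (n - 1)
        M[idx[v]][idx[u]] += 1.0 / (n - 1)
    return dict(zip(nodes, solve(M, [1.0] * n)))

# Hypothetical network: b has three negative ties, d has two, a, c and e have one.
nodes = ["a", "b", "c", "d", "e"]
neg_edges = [("a", "b"), ("b", "c"), ("b", "d"), ("d", "e")]
h = h_star(nodes, neg_edges)
```

Actors a, c and e all have negative degree d* = 0.75, but h* separates them: a and c, whose only negative tie is with the most disliked actor b, score higher than e, whose negative tie is with the less disliked actor d.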
6. Mixed data measures
Today's online social networks such as Facebook, Twitter, Orkut, and LinkedIn include all types of relations of friendship, support, liking, disliking, avoidance, distrust or no relation at all between various nodes. The measures discussed till now are applicable either on positive relations or on negative relations but in order to cater all aspects of relations possible in social networks, there is a need of measures that can analyze both of these relations jointly. Many researchers have proposed measures that consider data of both types.
6.1. Degree centrality measure
This measure is similar to the negative and positive degree centrality discussed earlier. A pair of actors connected to each other through a positive relation have positive degree, and if they are connected negatively then they have a negative value of degree, with respect to each other. Thus, the overall degree of an actor in the network is the sum of the negative and positive values of degree, as shown in Table 3. A person with a high overall positive value of degree is the most popular one in the network, and a person with a high overall negative value is the most disliked. This measure gives an idea of the status of a person in the network, but it does not generate a precise value if two actors have nearly similar connections with other actors in the network. This means it can assign the same value of degree to two or more persons in the network [31]. Also, the effect of the popularity of a person on the status of neighbor nodes is not included in this measure. Therefore, a more general measure is required.
6.2. Status measure
Bonacich and Lloyd [63] proposed a status measure using eigenvectors. In a network of both positive and negative relations, the status of an individual rises if he/she is positively connected to a high status individual and decreases if he/she has a negative relation to the same. If an individual has positive relations with negative status persons, then his/her status is also reduced. This measure is defined by using eigenvector and diagonal matrices. The adjacency matrix A has $a_{ij}$ equal to 1 if i and j are positively connected, -1 if they are negatively connected and 0 if there is no relationship. The graph is taken to be balanced. The eigenvector of A corresponding to the largest eigenvalue, which has positive and negative elements, reveals the clique structure. All the actors with a positive value of status belong to one clique and the actors with a negative value belong to another clique. Positive relations exist between members of the same clique and negative relations between members of different cliques. The status of an individual actor increases with an increase in the number of positive connections to members of the same clique and in the number of negative connections to members of different cliques [63]. This measure solves the problem of analyzing positive and negative ties jointly to some extent. But in some cases, when the balanced symmetric adjacency matrix A has multiple largest eigenvalues of the same magnitude, there exist multiple linearly independent eigenvectors [31]. It then becomes difficult to choose which of these eigenvectors should be used to assign status scores. Secondly, if the network has multiple components or cliques that are similar in structure, that is, connections between members of the same clique and members of other cliques are similar for multiple cliques, then the eigenvector of matrix A is not able to describe the scores of all the actors of a component correctly [62]. Thus, the eigenvector assigns anomalous scores to actors of the cliques. Also, if the graph is directed then the eigenvalues may not be real.

Table 3 Comparison of mixed data measures.

| Type of mixed data measure | Degree centrality | Status measure | Political Independence Index (PII) | PN centrality |
|---|---|---|---|---|
| Proposed by | Positive degree by L. C. Freeman (1979); negative degree by Everett and Borgatti (2014) | P. Bonacich and P. Lloyd | Smith et al. | M.G. Everett and S.P. Borgatti |
| Expression | $P(x) + N(x)$; P(x) is the number of direct positive ties and N(x) is the number of direct negative ties | $Ax = \lambda Dx$; A is the adjacency matrix; Dx is the eigenvector depicting status scores of each node | $\sum_{i \ge 0} b^{i}[P(i)^{x} - N(i)^{x}]$; P(i) and N(i) are the positive and negative edges incident on the node | $PN = (I - \frac{1}{2n - 2}A)^{-1}\mathbf{1}$; $A = P - 2N$, P is the positive tie matrix and N is the negative tie matrix |
| Weightage of direct and indirect links | Direct links = 1, indirect links = 0 | Direct links = 1, indirect links = $b^{k}$, where $b = 1/\lambda_1$, $\lambda_1$ is the largest eigenvalue and k is the number of edges between nodes | Direct links = $P(0)^{x} - N(0)^{x}$, indirect links weighted by $b^{i}$; i is the distance of the edge from the focal actor and x is an exponent which ensures that more weight is assigned to direct links than to indirect links | Direct links = b, indirect links = $b^{k}$; k is the path length from the given actor, $b = 1/(2n - 2)$; for large network size $b = 1/(2d)$ |
| Applicable on types of graphs | Both directed and undirected graphs | Only undirected graphs, because on a directed graph the eigenvalue may not be real | Only undirected networks | Both directed and undirected graphs |
| Concept used for analyzing measure | Positive and negative relations with nodes connected directly | Based on balance theory, forming two cliques where positive relations exist within cliques and negative relations between cliques | Based on the concept of positive and negative edges at a distance of i from a node in political networks | Based on the positive tie matrix P and the negative tie matrix N. Matrix A is formed by $A = P - 2N$ |
| Complexity | O(n) | | | |
6.3. Political Independence Index (PII)
The measures discussed so far focus on actors' characteristics and attributes in positive networks. All of them count the direct and indirect access of flows of information between pairs of nodes for analyzing positive and negative ties, as well as the power or status acquired by each actor in the network. Smith et al. [79] proposed a power measure, known as the Political Independence Index (PII), according to which any actor (also called the focal actor) in the network is dependent on other actors with respect to control of resources, support and information. Those other actors become more powerful than the focal actor in politically charged networks of both positive and negative ties. According to the PII measure, an actor can reduce his/her dependence by developing alternatives that are weaker and are themselves under direct threat. They classify power measures as power-as-access measures, which include degree, closeness, eigenvector and the Bonacich power measure with positive beta, and power-as-control measures, which include betweenness, Emerson's social exchange theory [80,81], the Graph theoretic power index (GPI) [82,83] and the Bonacich power measure with negative beta. All these power-as-control measures compute the power of an actor as the extent to which the actor controls the flow of resources and the exchange of information between pairs of other actors through direct or indirect links. The PII measure is also based upon the power-as-control approach and is closely related to the Bonacich power measure with negative beta. This index is defined for politically charged networks consisting of alliances (positive ties) and adversaries (negative ties). Smith et al. [79] proposed that when the adversaries of an actor grow in number, the actor forms alliances with other friendly actors to safeguard itself, in exchange for some resources. The extent to which resources are sacrificed depends upon the urgency of making alliances with alternatives. When the number of alternatives (allies) available for making alliances is limited, these potential allies can extract maximum resources from the focal actor, which decreases its power. These allies must not have many other alternatives with whom they can make alliances, as this directly decreases the power of the focal actor. The measure uses the concept of the distance of an edge (m, n) from the focal node u, defined as MIN(d(u, m), d(u, n)), where d is the geodesic distance between nodes. For direct ties, d(u, u) is 0 and d(u, v) is 1. Therefore, the distance of edge (u, v) from u is 0, and it increases progressively by 1 with each additional step along a path.
The PII measure can be stated as [79]
$$PII = \sum_{i \ge 0} b^{i}\left[P(i)^{x} - N(i)^{x}\right]$$
where P(i) and N(i) are the number of positive and negative edges at distance i from a node, b is the attenuation factor and x is an exponent bounded as
$$x < \frac{\ln(2) - \ln(|b|)}{\ln(M)}$$
i.e. $|b|M^{x} < 2$.
Here, M is the maximum number of ties from any node in the network. The attenuation factor gives the measure flexibility: it behaves as a power-as-access measure when b is positive and as a power-as-control measure when b is negative. As b is raised to the power i, the distance of the edge from the node, the measure takes into account the fact that a larger number of alternatives or allies available to a directly linked node at distance one decreases the power of the focal node. The exponent x ensures that two direct links contribute more to the power of the focal actor than one direct link and one indirect link. This implies that nearby actors in the network affect the power of the focal actor much more than distant actors [79]. The interpretation of this measure is slightly difficult in a mixed network of positive and negative ties. For example, consider the network of mixed data in Fig. 11.
According to the status measure, actor B is given a high score as it is connected to other well-connected actors such as C, D and A. But the PII measure assigns a very low score to actor B and treats it as the least powerful, because it has direct links with C, D and A, whose allies or alternatives E, F, G and H are totally dependent on them and have no positive connections to any nodes other than C, D and A. This example shows that PII does not capture the idea of centrality as other centrality measures do. It actually deals with the power of an actor, which becomes distinct from centrality when negative ties are considered.

Figure 11 Network of nodes having positive relations depicted by solid lines and negative relations depicted by dashed lines.
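The edge-distance bookkeeping behind PII can be sketched as follows. This is only an illustration under stated assumptions: geodesic distances are taken over all ties regardless of sign, the index is evaluated as the sum of $b^{i}[P(i)^{x} - N(i)^{x}]$ over the distances i that occur, and the small network, the function name `pii` and the values of b and x are made up for the example rather than taken from Smith et al. [79].

```python
from collections import deque

def pii(pos_edges, neg_edges, focal, b, x):
    """Sum b**i * (P(i)**x - N(i)**x) over edge distances i from the focal node."""
    # Build an undirected adjacency over all ties (assumption: sign is ignored
    # when measuring geodesic distance).
    adj = {}
    for u, v in pos_edges + neg_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Breadth-first search gives geodesic distances d(focal, v).
    dist = {focal: 0}
    queue = deque([focal])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    def counts(edges):
        # Distance of edge (m, n) from the focal node is min(d(u, m), d(u, n)).
        c = {}
        for m, n in edges:
            if m in dist and n in dist:
                d = min(dist[m], dist[n])
                c[d] = c.get(d, 0) + 1
        return c

    P, N = counts(pos_edges), counts(neg_edges)
    return sum(b ** i * (P.get(i, 0) ** x - N.get(i, 0) ** x)
               for i in set(P) | set(N))

# Hypothetical network: u has allies v and w and adversary q; ally v has one more ally z.
pos_edges = [("u", "v"), ("u", "w"), ("v", "z")]
neg_edges = [("u", "q")]
score_control = pii(pos_edges, neg_edges, "u", b=-0.5, x=1.0)
score_access = pii(pos_edges, neg_edges, "u", b=0.5, x=1.0)
```

With a negative b the extra ally of v (an edge at distance 1) lowers the score of u, and with a positive b it raises it, matching the power-as-control and power-as-access readings described above.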
6.4. PN centrality measure
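The fourth measure is an extension of the h* measure defined for negative data networks. Everett and Borgatti [31] proposed the PN centrality measure, which uses the h* measure for negative ties and the Hubbell measure for positive ties. The value of b is chosen as 1/(2n - 2) to normalize Hubbell centrality scores in a complete network, and matrix A is calculated as P - 2N, where P is the positive tie matrix and N is the negative tie matrix [31]:
$$PN = \left(I - \frac{1}{2n - 2}A\right)^{-1}\mathbf{1}$$
The directed version of the PN measure is
$$PN_{in} = \left(I - \frac{A^{T}A}{4(n - 1)^2}\right)^{-1}\left(I + \frac{A^{T}}{2(n - 1)}\right)\mathbf{1}$$
$$PN_{out} = \left(I - \frac{AA^{T}}{4(n - 1)^2}\right)^{-1}\left(I + \frac{A}{2(n - 1)}\right)\mathbf{1}$$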
The values of $PN_{in}$ and $PN_{out}$ are analyzed in the same way as $h^*_{in}$ and $h^*_{out}$. A lower value of $PN_{in}$ means that the person has more incoming negative ties from other people, particularly from those who are popular in the network, i.e. people with a high $PN_{out}$ score. Similarly, a high $PN_{in}$ value means receiving fewer negative ties and more positive ties. Various other permutations are also possible. The range of PN scores is 1 to 2 for a network with only positive ties, 0 to 1 for a network with only negative ties, and -1 to 2 for a network with both positive and negative ties. The complexity of PN is $O(n^3)$. Other values of b can be used, such as 1/(2d), where d is the maximum positive or negative degree of any node in the network. But this value of b should always be smaller than $1/\lambda$, where $\lambda$ is the largest eigenvalue, so that the centrality scores of actors remain proportional to the scores of neighboring actors [31].
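The undirected PN formula can be evaluated for a small signed network with the same kind of direct linear solve. The three-actor network and the helper names `solve` and `pn_centrality` below are hypothetical, and the code is a sketch of the formula as stated above rather than a reference implementation.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def pn_centrality(nodes, pos_edges, neg_edges):
    """PN = (I - A/(2n-2))^{-1} 1 with A = P - 2N, for undirected ties."""
    n = len(nodes)
    idx = {v: i for i, v in enumerate(nodes)}
    b = 1.0 / (2 * n - 2)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in pos_edges:              # entry of A is +1, so subtract b
        M[idx[u]][idx[v]] -= b
        M[idx[v]][idx[u]] -= b
    for u, v in neg_edges:              # entry of A is -2, so add 2b
        M[idx[u]][idx[v]] += 2 * b
        M[idx[v]][idx[u]] += 2 * b
    return dict(zip(nodes, solve(M, [1.0] * n)))

nodes = ["a", "b", "c"]
# Hypothetical mixed network: a and b are friends, b and c dislike each other.
mixed = pn_centrality(nodes, pos_edges=[("a", "b")], neg_edges=[("b", "c")])
# Complete positive network: every score reaches the upper bound of 2.
all_positive = pn_centrality(nodes, [("a", "b"), ("b", "c"), ("a", "c")], [])
```

In the mixed example the scores are 14/11 for a, 12/11 for b and 5/11 for c, so the actor whose only tie is negative scores lowest, and the complete positive network reproduces the stated upper bound of the 1 to 2 range.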
The negative tie measures discussed so far are based upon the scores of the neighboring actors to which a given node is connected. In such measures only the scores of actors propagate through the network, as it is very difficult to interpret a flow of resources through negative ties. This is in contrast to positive ties, where centrality measures try to capture the flow of information, resources, etc. through the network, which is supported by a high level of transitivity. The measures that are used for analyzing negative ties are all based on this idea of propagating the scores assigned by the indices rather than capturing a flow.
7. Strengths and weaknesses of measures
The strengths and weaknesses of all the measures discussed in this paper are summarized in Table 4.
Table 4 Strengths and weaknesses of all measures.

| Measure type | Measure name | Strengths | Weaknesses |
|---|---|---|---|
| Centrality measures | Degree centrality | • It is independent of flows in the network • It is used to calculate direct influence • It can be used for analyzing both negative and positive ties | • It cannot calculate the centrality a node possesses due to nodes located more than one path length away from it |
| Centrality measures | Betweenness centrality | • It calculates the centrality of a node based on its ability to control the stream of information passing through the network • It can be used for identifying positive ties | • It is dependent on flows in the network • It cannot analyze negative tie networks, where flows are absent |
| Centrality measures | Closeness centrality | • The closeness score measures a node's efficiency in delivering a message to the other nodes of the network in the shortest possible time • It can identify all the connections that a node has in positive networks | • It is totally dependent on flows in the network • In disconnected graphs of negative networks, closeness becomes zero for unreachable nodes |
| Centrality measures | Eigenvector centrality | • It calculates the centrality of a node based on the centrality of neighbor nodes located one or more path lengths away • It also includes the impact of the centrality of neighboring nodes in the score of each node • It can be applied to both positive and negative networks | • In very large networks, the eigenvalues of the network may not be real, or multiple eigenvectors may exist |
| Influence measures | Katz measure | • Katz was the first to observe that the status score of a node depends not only on direct links but also on the status of next-to-immediate links • It works well with positive tie networks | • It is not suitable for analyzing negative ties in a network, as it depends on the flow of resources |
| Influence measures | Hubbell measure | • It allows negative link weights and therefore includes negative relations, such as enemies, in the computation | • It is not effective in calculating the centrality of relations in negative networks |
| Influence measures | Bonacich power | • It computes the centrality of nodes based on all walk lengths and can be used in both positive and negative networks | • Although it can analyze ties in both networks by using positive and negative values of b, it cannot analyze both types of ties simultaneously in mixed networks |
| Mixed data measures | Degree measure | • It can identify both positive and negative ties simultaneously in mixed networks | • It does not consider the effect of a person's popularity on the status of neighbor nodes • It fails to distinguish between two actors if they have similar connections |
| Mixed data measures | Status measure | • It can identify both positive and negative ties in small networks | • In larger networks, multiple largest eigenvalues with the same magnitude may exist • There may also be no real eigenvalue, which makes it difficult to choose a single eigenvector that represents the correct status scores |
| Mixed data measures | PII measure | • It correctly analyzes the power of a node in a network of allies and adversaries | • It cannot identify the most positive and most negative nodes in the correct order in mixed data networks |
| Mixed data measures | PN centrality | • It can correctly identify the popular and unpopular nodes of a network, with a unique score assigned to every node | • The results of this measure are still unexplored in large online datasets |

As is clear from the table, only mixed data measures can analyze both positive and negative ties simultaneously in a social network, whereas centrality and influence measures can identify only one type of tie at a time (i.e. either positive or negative). Everett and Borgatti [31] analyzed the accuracy of mixed data measures in identifying outsiders of social networks by applying them to small datasets such as Sampson's monastery [17] and Bank Wiring Room [15]. These datasets were gathered by the original researchers, who were physically present at the location and inferred the relationships between nodes by observing their behavior. Everett and Borgatti pointed out that PN centrality is better than the status and PII measures, as neither of them is able to identify the central or the peripheral actors (isolates) in the network. Comparing PN with degree, they concluded that PN is more accurate and maps small differences between the centralities of nodes to more precise values that can be easily compared for analysis. The accuracy of the results obtained by Everett and Borgatti [31] was determined solely by applying the measures to small offline datasets. The results still needed to be verified on large datasets from current online social networks, in which thousands of nodes interact with each other. To verify them, we applied these measures to samples collected from the large online networks of Epinions and Slashdot and observed that PN centrality outperforms the other measures. The results indicate that only the PN measure identifies most of the outsiders of the network, whereas the degree measure is unable to identify the most popular nodes of the network correctly. The status measure becomes inapplicable on large datasets due to imaginary eigenvalues of the matrix. The PII measure is also unable to identify outsiders in large datasets and therefore performs similarly to what was observed in small datasets. Further work can be carried out to enhance the performance of PN centrality in datasets of online social networks by varying the value of the parameter b.
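The remark about imaginary eigenvalues can be illustrated with a hypothetical two-actor network, which is not drawn from the datasets discussed here: actor 1 sends a positive tie to actor 2, and actor 2 sends a negative tie back to actor 1. The signed adjacency matrix and its characteristic equation are then

```latex
A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
\qquad
\det(A - \lambda I) = \lambda^{2} + 1 = 0
\;\Longrightarrow\;
\lambda = \pm i .
```

Both eigenvalues are purely imaginary, so there is no real eigenvector from which status scores could be read. Asymmetric signed matrices from large directed networks can contain many such reciprocated but opposite-signed pairs, which is one way the situation described above can arise.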
8. Conclusion and future scope
The exploration of negative ties in social networks has become necessary now that social interactions are carried out online, where the person one communicates with may not be who they claim to be. This may lead to social crimes such as masquerading, bullying, and group conflicts, which need to be dealt with by observing the structure and connections in the network. This study of networks is known as social network analysis. The analysis of negative ties such as distrust and opposition may also help users of consumer review websites such as Epinions.com and Slashdot.com decide which product to purchase by checking rated reviews. In this paper, a survey of network analysis techniques that can be used to examine such negative relations is presented. From the discussion, it has been observed that negative ties are difficult to explore through standard methods and techniques. The view that negative ties can be analyzed in the same way as positive ties, merely by reversing the interpretation, is not completely true.
The paper is structured into different sections. Section 2 discusses the application of standard concepts such as equivalence of nodes, statistical models, and centrality methods to negative tie networks. Some concepts, such as equivalence and statistical techniques, are perfectly suitable, while others, such as degree centrality, require a few alterations in interpretation but are applicable to both types of ties. Other concepts, such as betweenness and closeness centralities, rely on network flows and thus cannot be applied to negative ties, where flow among the nodes of a network is minimal. The centrality measures are distinguished on the basis of various parameters and the type of flows they follow. In the third section, the examination of influence measures applied to negative ties shows that some measures, such as the Bonacich power and Hubbell measures, are easily adapted to positive and negative ties by adjusting the value of the attenuation factor, while others are inapplicable due to the absence of network flows in negative ties.
Due to these issues, the fourth section of the paper discusses some proposed approaches, such as the Graph complement approach and Negative cliques, that allow positive tie measures to be applied to negative tie data by reversing the interpretation. But they too impose some restrictions on the type of networks. Therefore, new measures such as negative degree and the h* measure were developed specifically for analyzing negative ties. These are discussed in Section 5 of the paper. Of these two, the h* measure proves to be more efficient, as it calculates the centrality of a given node by considering the centralities of its neighbor nodes. In order to analyze both types of ties jointly, mixed data measures proposed by many researchers are described in Section 6, and comparisons are made in terms of their capability in identifying negative ties in a network. The strengths and weaknesses of all the discussed measures are presented in the seventh section. From this discussion, it is observed that only the PN centrality measure works in practical social networks and identifies outsiders well.
As this field of research is still largely unexplored, not much work has been done to analyze ties in online social networks. Hence, we analyzed the applicability of mixed data measures on small samples of the Epinions network dataset and found that, of all the measures, the PN measure performed quite well in identifying outsiders in online networks as well. Although a number of methods are available for analyzing ties in negative as well as positive tie networks, there are still some areas that need to be worked upon:
i. Most of the measures discussed in this paper have been analyzed by applying them to standard datasets containing old, offline data. These measures need to be studied on recent data from online social networks such as Facebook, Twitter and LinkedIn.
ii. The PN measure can be further improved by varying the value of b (attenuation factor) for better analysis of the behavior of relations in online social networks.
iii. Measures for analyzing ties based on Freeman's degree and betweenness views of centrality need to be further explored for their application in negative tie networks by using new approaches, such as Graph complement.
iv. A combination of eigenvector centrality and Freeman centrality measures can be made to develop new measures that will exploit properties of both kinds of indices.
v. The concept of the negative clique can be further explored for developing new measures based on group behavior in negative tie networks.
vi. More optimized results can be achieved by first identifying cliques in social networks and then applying measures such as PN centrality for identifying actors showing negative behavior.
References
[1] Schneider F, Feldmann A, Krishnamurthy B, Willinger W. Understanding online social network usage from a network perspective. In: Proceedings of the 9th ACM SIGCOMM conference on internet measurement conference; 2009. p. 35-48.
[2] Kumar MJ. Expanding the boundaries of your research using social media: stand-up and be counted. IETE Tech Rev 2014;31:255-7.
[3] Davis JA. Clustering and structural balance in graphs. Hum Relat 1967.
[4] Heider F. Attitudes and cognitive organization. J Psychol 1946;21:107-12.
[5] Cartwright D, Harary F. Structural balance: a generalization of Heider's theory. Psychol Rev 1956;63(5):277.
[6] Doreian P, Mrvar A. Partitioning signed networks. Soc Networks 2009;31:1-11.
[7] Boyd JP. Social semi-groups: a unified theory of scaling and blockmodeling as applied to social networks. Fairfax, VA: George Mason University Press; 1990.
[8] Borgatti SP, Brass DJ, Halgin DS. Social network research: confusions, criticisms, and controversies. Res Sociol Org 2014;40:1-29.
[9] Bohn A, Buchta C, Hornik K, Mair P. Making friends and communicating on Facebook: implications for the access to social capital. Soc Networks 2014;37:29-41.
[10] Box-Steffensmeier JM, Christenson DP. The evolution and formation of amicus curiae networks. Soc Networks 2014;36:82-96.
[11] Smith JA, McPherson M, Smith-Lovin L. Social distance in the United States: sex, race, religion, age, and education homophily among confidants, 1985 to 2004. Am Sociol Rev 2014;79(3):432-56.
[12] de Jong JP, Curşeu PL, Leenders RTAJ. When do bad apples not spoil the barrel? Negative relationships in teams, team performance, and buffering mechanisms. J Appl Psychol 2014;99:514.
[13] Lakkaraju K. Reducing diffusion time in attitude diffusion models through agenda setting. In: Proceedings of the 2015 international conference on autonomous agents and multiagent systems, international foundation for autonomous agents and multiagent systems; 2015.
[14] Boda Z, Néray B. Inter-ethnic friendship and negative ties in secondary school. Soc Networks 2015;43:57-72.
[15] Roethlisberger FJ, Dickson WJ. Management and the worker. Cambridge: Cambridge University Press; 2003.
[16] Read K. Cultures of the central highlands, New Guinea. Southwestern J Anthropol 1954:1-43.
[17] Sampson S. Crisis in a cloister. Unpublished PhD Thesis. Cornell University; 1969.
[18] Burke M, Kraut R. Mopping up: modeling wikipedia promotion decisions. In: Proc CSCW; 2008.
[19] Guha RV, Kumar R, Raghavan P, Tomkins A. Propagation of trust and distrust. In: Proc 13th WWW; 2004.
[20] Massa P, Avesani P. Controversial users demand local trust metrics: an experimental study on epinions.com community. AAAI '05. AAAI Press; 2005. p. 121-6.
[21] Brzozowski MJ, Hogg T, Szabo G. Friends and foes: ideological social networking. In: Proc 26th CHI; 2008.
[22] Kunegis J, Lommatzsch A, Bauckhage C. The slashdot zoo: mining a social network with negative edges. In: Proc 18th WWW; 2009. p. 741-50.
[23] Lampe C, Johnston E, Resnick P. Follow the reader: filtering comments on slashdot. In: Proc 25th CHI; 2007.
[24] Breiger RL, Boorman SA, Arabie P. An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. J Math Psychol 1975;12:328-83.
[25] White DR, Reitz KP. Graph and semigroup homomorphisms on networks of relations. Soc Networks 1983;5(2):193-234.
[26] Everett MG, Borgatti SP. Regular equivalence: general theory. J Math Sociol 1994;19(1):29-52.
[27] Labianca G. Negative ties in organizational networks. Res Sociol Org 2014;40:239-59.
[28] Lorrain F, White HC. Structural equivalence of individuals in social networks. J Math Sociol 1971;1(1):49-80.
[29] Hubert L, Schultz J. Quadratic assignment as a general data analysis strategy. Br J Math Stat Psychol 1976;29(2):190-241.
[30] Dekker D, Krackhardt D, Snijders TAB. Sensitivity of MRQAP tests to collinearity and autocorrelation conditions. Psychometrika 2007;72(4):563-81.
[31] Everett MG, Borgatti SP. Networks containing negative ties. Soc Networks 2014;38:111-20.
[32] Bavelas A. A mathematical model for group structures. Hum Org 1948;7(3).
[33] Leavitt HJ. Some effects of certain communication patterns on group performance. Unpublished PhD thesis. Massachusetts Institute of Technology, Cambridge; 1949.
[34] Smith SL. Communication pattern and the adaptability of task-oriented groups: an experimental study. Cambridge, MA: Massachusetts Institute of Technology; 1950.
[35] Bavelas A. Communication patterns in task-oriented groups. J Acoust Soc Am 1950.
[36] Cohn BS, Marriott M. Networks and centres of integration in Indian civilization. J Soc Res 1958;1(1):1-9.
[37] Pitts FR. A graph theoretic approach to historical geography. Professional Geogr 1965;17(5):15-20.
[38] Beauchamp MA. An improved index of centrality. Behavioral Science 1965;10(2):161-3.
[39] Mackenzie KD. The information theoretic entropy function as a total expected participation index for communication network experiments. Psychometrika 1966;31(2):249-54.
[40] Rogers D. Sociometric analysis of interorganizational relations: application of theory and measurement. Rural Sociol 1974.
[41] Freeman LC. Centrality in social networks conceptual clarification. Soc Networks 1979;1(3):215-39.
[42] Shaw ME. Group structure and the behavior of individuals in small groups. J Psychol 1954;38(1):139-49.
[43] Garrison W. Connectivity of the interstate highway system. Pap Region Sci 1960;6(1):121-37.
[44] Faucheux C, Moscovici S. Etudes sur la creativite des groupes taches, structures des communications et reussite. Bull C.E.R.P. 1960;9:11-22.
[45] Mackenzie KD. Structural centrality in communications networks. Psychometrika 1966;31(1):17-25.
[46] Czepiel JA. Word-of-mouth processes in the diffusion of a major technological innovation. J Market Res 1974:172-80.
[47] Nieminen UJ. On the centrality in a directed graph. Soc Sci Res 1973;2(4):371-8.
[48] Nieminen J. On centrality in a graph. Scandinavian J Psychol 1974;15:322-36.
[49] Kajitani T, Maruyama Y. Functional expression of centrality in a graph - an application to the assessment of communication networks. Electron Commun Jpn 1976;5924:9-17.
[50] Borgatti SP. Centrality and AIDS. Connections 1995;18(1):112-4.
[51] Borgatti SP. Centrality and network flow. Soc Networks 2005;27:55-71.
[52] Shimbel A. Structural parameters of communication networks. Bull Math Biophys 1953;15(4).
[53] Freeman LC, Borgatti SP, White DR. Centrality in valued graphs: a measure of betweenness based on network flow. Soc Networks 1991;13(2):141-54.
[54] Stephenson K, Zelen M. Rethinking centrality: methods and examples. Soc Networks 1989;11(1):1-37.
[55] Ford LR, Fulkerson DR. A simple algorithm for finding maximal network flows and an application to the Hitchcock problem. Rand Corporation; 1955.
[56] Ford LR, Fulkerson DR. Maximal flow through a network. Can J Math 1956;8(3):399-404.
[57] Ford LR, Fulkerson DR. Flows in networks. Princeton; 1962.
[58] Sabidussi G. The centrality index of a graph. Psychometrika 1966;31(4):581-603.
[59] Moxley RL, Moxley NF. Determining point centrality in uncontrived social networks. Sociometry 1974:122-30.
[60] Bonacich P. Factoring and weighting approaches to status scores and clique identification. J Math Sociol 1972;2(1):113-20.
[61] Ruhnau B. Eigenvector-centrality—a node-centrality? Soc Networks 2000;22(4):357-65.
[62] Bonacich P. Some unique properties of eigenvector centrality. Soc Networks 2007;29(4):555-64.
[63] Bonacich P, Lloyd P. Calculating status with negative relations. Soc Networks 2004;26(4):331-8.
[64] Bonacich P. Power and centrality: a family of measures. Am J Sociol 1987:1170-82.
[65] Katz L. A new status index derived from sociometric analysis. Psychometrika 1953;18(1):39-43.
[66] Hubbell CH. An input-output approach to clique identification. Sociometry 1965:377-99.
[67] Forsyth E, Katz L. A matrix approach to the analysis of sociometric data: preliminary report. Sociometry 1946:340-7.
[68] Festinger L. The analysis of sociograms using matrix algebra. Hum Relat 1949.
[69] Cook KS, Emerson RM, Gillmore MR. The distribution of power in exchange networks: theory and experimental results. Am J Sociol 1983:275-305.
[70] Leavitt HJ. Some effects of certain communication patterns on group performance. American Psychological Association; 1951.
[71] Berkowitz L. Personality and group position. Sociometry 1956:210-22.
[72] Shaw ME. Communication networks. Adv Exp Soc Psychol 1964;1:111-47.
[73] Bonacich P, Lloyd P. Eigenvector-like measures of centrality for asymmetric relations. Soc Networks 2001;23(3):191-201.
[74] Borgatti SP, Everett MG. A graph-theoretic perspective on centrality. Soc Networks 2006;28(4):466-84.
[75] Luce RD, Perry AD. A method of matrix analysis of group structure. Psychometrika 1949;14(2):95-116.
[76] Everett MG, Borgatti SP. Analyzing clique overlap. Connections 1998;21(1):49-61.
[77] Borgatti SP, Everett MG, Freeman LC. UCINET 6 for Windows: software for social network analysis (Version 6.102). Harvard, MA: Analytic Technologies; 2002.
[78] Borgatti SP. Identifying sets of key players in a social network. Comput Math Org Theory 2006;12(1):21-34.
[79] Smith JM, Halgin DS, Kidwell-Lopez V, Labianca G, Brass DJ, Borgatti SP. Power in politically charged networks. Soc Networks 2014;36:162-76.
[80] Emerson RM. Power-dependence relations. Am Sociol Rev 1962:31-41.
[81] Emerson R. Exchange theory. Part II: exchange relations and networks. Sociol Theor Prog 1972;2:58-87.
[82] Markovsky B, Willer D, Patton T. Power relations in exchange networks. Am Sociol Rev 1988:220-36.
[83] Markovsky B, Skvoretz J, Willer D, Lovaglia MJ, Erger J. The seeds of weak power: an extension of network exchange theory. Am Sociol Rev 1993:197-209.