Link analysis

Investigative data-analysis technique


In network theory, link analysis is a data-analysis technique used to evaluate relationships between nodes. Relationships may be identified among various types of nodes, including organizations, people and transactions. Link analysis has been used for investigation of criminal activity (fraud, counterterrorism, and intelligence), computer security analysis, search engine optimization, market research, medical research, and art.

Knowledge discovery

Knowledge discovery is an iterative and interactive process used to identify, analyze and visualize patterns in data. Network analysis, link analysis and social network analysis are all methods of knowledge discovery, each a corresponding subset of the prior method. Most knowledge discovery methods follow these steps (at the highest level):

  1. Data processing
  2. Transformation
  3. Analysis
  4. Visualization

Data gathering and processing require access to data and raise several inherent issues, including information overload and data errors. Once data is collected, it needs to be transformed into a format that both human and computer analyzers can use effectively. Visualization tools, whether manual or computer-generated, such as network charts, can then be mapped from the data. Several algorithms exist to help with analysis of the data, including Dijkstra's algorithm, breadth-first search, and depth-first search.
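As an illustration of how one of the algorithms named above might be applied to link data, the following minimal sketch uses breadth-first search to find the fewest-hop chain of links between two entities. The `links` data and the function name `shortest_link_path` are hypothetical and exist only for this example.

```python
from collections import deque


def shortest_link_path(graph, start, goal):
    """Return the fewest-hop path from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}          # node -> node it was reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue              # already visited
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the back-pointers to rebuild the path.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical undirected link data stored as adjacency lists.
links = {
    "Alice": ["Bob", "Acme Ltd"],
    "Bob": ["Alice", "Carol"],
    "Acme Ltd": ["Alice", "Account 17"],
    "Carol": ["Bob", "Account 17"],
    "Account 17": ["Acme Ltd", "Carol"],
    "Dave": [],
}

print(shortest_link_path(links, "Alice", "Account 17"))
# ['Alice', 'Acme Ltd', 'Account 17']
```

Breadth-first search treats every link as equally costly; Dijkstra's algorithm would be the natural choice if links carried weights.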

Link analysis focuses on analysis of relationships among nodes through visualization methods (network charts, association matrix). Here is an example of the relationships that may be mapped for crime investigations:

Relationship/Network: Data sources

  1. Trust: Prior contacts in family, neighborhood, school, military, club or organization. Public and court records. Data may only be available in suspect's native country.
  2. Task: Logs and records of phone calls, electronic mail, chat rooms, instant messages, Web site visits. Travel records. Human intelligence: observation of meetings and attendance at common events.
  3. Money & Resources: Bank account and money transfer records. Pattern and location of credit card use. Prior court records. Human intelligence: observation of visits to alternate banking resources such as Hawala.
  4. Strategy & Goals: Web sites. Videos and encrypted disks delivered by courier. Travel records. Human intelligence: observation of meetings and attendance at common events.

Link analysis is used for three primary purposes:

  1. Find matches in data for known patterns of interest;
  2. Find anomalies where known patterns are violated;
  3. Discover new patterns of interest (social network analysis, data mining).

History

Klerks categorized link analysis tools into three generations. The first generation was introduced in 1975 as the Anacapa Chart of Harper and Harris. In this method, a domain expert reviews data files, identifies associations by constructing an association matrix, creates a link chart for visualization, and finally analyzes the network chart to identify patterns of interest. The method requires extensive domain knowledge and is extremely time-consuming when reviewing vast amounts of data.
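The manual workflow described above (data files, then association matrix, then link chart) can be sketched in code. This is only an illustration under stated assumptions: each record is taken to be a list of entities that appear together in one data file, and each matrix cell holds a simple count, which is a simplification chosen for this example rather than the notation of the original method. The names `association_matrix` and `link_chart_edges` are hypothetical.

```python
from itertools import combinations


def association_matrix(records):
    """Build a symmetric entity-by-entity matrix of co-appearance counts."""
    entities = sorted({entity for record in records for entity in record})
    index = {entity: i for i, entity in enumerate(entities)}
    matrix = [[0] * len(entities) for _ in entities]
    for record in records:
        # Count each unordered pair once per record.
        for a, b in combinations(sorted(set(record)), 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return entities, matrix


def link_chart_edges(entities, matrix):
    """List the non-zero cells above the diagonal as (a, b, count) edges."""
    size = len(entities)
    return [
        (entities[i], entities[j], matrix[i][j])
        for i in range(size)
        for j in range(i + 1, size)
        if matrix[i][j] > 0
    ]


# Hypothetical records: entities mentioned together in the same file.
records = [
    ["Alice", "Bob"],
    ["Alice", "Bob", "Carol"],
    ["Carol", "Dave"],
]

entities, matrix = association_matrix(records)
edges = link_chart_edges(entities, matrix)
for edge in edges:
    print(edge)
```

The edge list is what a link chart would draw; the analysis of the resulting chart is the step that still depends on the expert.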

In addition to the association matrix, the activities matrix can be used to produce actionable information of practical value to law enforcement. The activities matrix, as the term implies, centers on the actions and activities of people with respect to locations, whereas the association matrix focuses on the relationships between people, organizations, and/or properties. The distinction between these two types of matrices, while minor, is nonetheless significant for the output of the analysis.
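The difference between the two matrices can be made concrete with a small sketch. Here the activities matrix has one row per person and one column per location, while the derived matrix is person-by-person. Deriving associations by counting shared locations is an assumption made for this illustration, not a step the source describes; all names and data are hypothetical.

```python
people = ["Alice", "Bob", "Carol"]
locations = ["Warehouse", "Cafe", "Marina"]

# Activities matrix: rows are people, columns are locations.
# A 1 means the person was observed at that location.
activities = [
    [1, 1, 0],  # Alice
    [1, 0, 0],  # Bob
    [0, 1, 1],  # Carol
]


def shared_location_counts(activity_rows):
    """Return a person-by-person matrix counting locations both visited."""
    n = len(activity_rows)
    counts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                counts[i][j] = sum(
                    a * b for a, b in zip(activity_rows[i], activity_rows[j])
                )
    return counts


associations = shared_location_counts(activities)
for person, row in zip(people, associations):
    print(person, row)
```

In this toy data Alice shares one location with Bob and one with Carol, while Bob and Carol share none, so the person-by-person view and the person-by-location view answer different questions.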

Second-generation tools consist of automatic graphics-based analysis tools such as IBM i2 Analyst's Notebook, Netmap, ClueMaker and Watson. These tools can automate the construction and updating of the link chart once an association matrix has been created manually; however, analysis of the resulting charts and graphs still requires an expert with extensive domain knowledge.

Third-generation link-analysis tools, such as DataWalk, allow the automatic visualization of linkages between elements in a data set, which can then serve as the canvas for further exploration or manual updates.

Applications

  • FBI Violent Criminal Apprehension Program (ViCAP)
  • Iowa State Sex Crimes Analysis System
  • Minnesota State Sex Crimes Analysis System (MIN/SCAP)
  • Washington State Homicide Investigation Tracking System (HITS)
  • New York State Homicide Investigation & Lead Tracking (HALT)
  • New Jersey Homicide Evaluation & Assessment Tracking (HEAT)
  • Pennsylvania State ATAC Program
  • Violent Crime Linkage Analysis System (ViCLAS)

Proposed solutions

There are four categories of proposed link analysis solutions:

  1. Heuristic-based
  2. Template-based
  3. Similarity-based
  4. Statistical

Heuristic-based tools utilize decision rules that are distilled from expert knowledge using structured data. Template-based tools employ Natural Language Processing (NLP) to extract details from unstructured data that are matched to pre-defined templates. Similarity-based approaches use weighted scoring to compare attributes and identify potential links. Statistical approaches identify potential links based on lexical statistics.
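To make the similarity-based category concrete, the sketch below scores pairs of records by summing the weights of the attributes on which they agree and reports pairs at or above a threshold as potential links. The attribute names, integer weights, threshold, and record data are all hypothetical choices for this example; a real system would likely use fuzzier comparisons than exact equality.

```python
from itertools import combinations

# Hypothetical weights: how much agreement on each attribute contributes.
WEIGHTS = {"name": 5, "phone": 3, "address": 2}


def similarity(a, b, weights=WEIGHTS):
    """Sum the weights of attributes present in both records and equal."""
    score = 0
    for attribute, weight in weights.items():
        value_a = a.get(attribute)
        value_b = b.get(attribute)
        if value_a is not None and value_a == value_b:
            score += weight
    return score


def potential_links(records, threshold=5):
    """Return (id_a, id_b, score) for record pairs scoring >= threshold."""
    found = []
    for id_a, id_b in combinations(sorted(records), 2):
        score = similarity(records[id_a], records[id_b])
        if score >= threshold:
            found.append((id_a, id_b, score))
    return found


# Hypothetical records drawn from different data sources.
records = {
    "r1": {"name": "j. smith", "phone": "555-0101", "address": "12 Elm St"},
    "r2": {"name": "john smith", "phone": "555-0101", "address": "12 Elm St"},
    "r3": {"name": "j. smith", "phone": "555-0199", "address": "9 Oak Ave"},
}

matches = potential_links(records)
print(matches)
# [('r1', 'r2', 5), ('r1', 'r3', 5)]
```

Records r1 and r2 are linked through a shared phone and address, r1 and r3 through a shared name, and r2 and r3 not at all, which shows how the choice of weights decides which links surface.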

CrimeNet Explorer

J.J. Xu and H. Chen propose a framework for automated network analysis and visualization called CrimeNet Explorer. This framework includes the following elements:

  • Network Creation through a concept space approach that uses "co-occurrence weight to measure the frequency with which two words or phrases appear in the same document. The more frequently two words or phrases appear together, the more likely it will be that they are related".
  • Network Partition using "hierarchical clustering to partition a network into subgroups based on relational strength".
  • Structural Analysis through "three centrality measures (degree, betweenness, and closeness) to identify central members in a given subgroup. CrimeNet Explorer employed Dijkstra's shortest-path algorithm to calculate the betweenness and closeness from a single node to all other nodes in the subgroup".
  • Network Visualization using Torgerson's metric multidimensional scaling (MDS) algorithm.
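Two of the elements listed above, co-occurrence weighting and centrality computed with Dijkstra's shortest-path algorithm, can be sketched as follows. This is not the CrimeNet Explorer implementation: the conversion of a co-occurrence weight w into a link length of 1/w, the closeness formula (reachable nodes divided by total distance), and all function names and data are assumptions made for this illustration. Partitioning, betweenness, and MDS layout are omitted.

```python
import heapq
from collections import Counter
from itertools import combinations


def cooccurrence_weights(documents):
    """Count how often each pair of names appears in the same document."""
    counts = Counter()
    for names in documents:
        for a, b in combinations(sorted(set(names)), 2):
            counts[(a, b)] += 1
    return counts


def build_graph(weights):
    """Assumption: stronger co-occurrence means a shorter link (1 / weight)."""
    graph = {}
    for (a, b), weight in weights.items():
        graph.setdefault(a, {})[b] = 1.0 / weight
        graph.setdefault(b, {})[a] = 1.0 / weight
    return graph


def dijkstra(graph, source):
    """Shortest distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, length in graph[node].items():
            candidate = d + length
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


def degree(graph, node):
    """Degree centrality: number of direct links."""
    return len(graph[node])


def closeness(graph, node):
    """Closeness centrality: reachable nodes divided by total distance."""
    dist = dijkstra(graph, node)
    total = sum(d for other, d in dist.items() if other != node)
    return (len(dist) - 1) / total if total > 0 else 0.0


# Hypothetical documents, each listing the names it mentions.
documents = [
    ["Alice", "Bob"],
    ["Alice", "Bob", "Carol"],
    ["Carol", "Dave"],
]

weights = cooccurrence_weights(documents)
graph = build_graph(weights)
most_central = max(graph, key=lambda name: closeness(graph, name))
print(most_central, degree(graph, most_central))
# Carol 3
```

In this toy network Carol has both the highest degree and the highest closeness, because she is the only link between Dave and the rest of the group.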

References

  1. The Tor Project, Inc. "Tor Project: Overview".
  2. Ahonen, H., Features of Knowledge Discovery Systems, http://www.cs.helsinki.fi/u/hahonen/features.txt (archived 2012-12-08).
  3. Krebs, V. E. 2001, Mapping networks of terrorist cells, Connections 24, 43–52, http://vlado.fmf.uni-lj.si/pub/networks/doc/Seminar/Krebs.pdf (archived 2011-07-20).
  4. Air Force Research Laboratory Information Directorate, Rome Research Site, Rome, New York, September 2004 (archived 2017-05-17).
  5. Klerks, P.. (2001). "The network paradigm applied to criminal organizations: Theoretical nitpicking or a relevant doctrine for investigators? Recent developments in the Netherlands". Connections.
  6. Harper and Harris, The Analysis of Criminal Intelligence, Human Factors and Ergonomics Society Annual Meeting Proceedings, 19(2), 1975, pp. 232-238.
  7. Pike, John. "FMI 3-07.22 Appendix F Intelligence Analysis Tools and Indicators".
  8. Social Network Analysis and Other Analytical Tools, https://rdl.train.army.mil/catalog/view/100.ATSC/41449AB4-E8E0-46C4-8443-E4276B6F0481-1274576841878/3-24/appb.htm (archived 2014-03-08).
  9. MSFC, Rebecca Whitaker. (10 July 2009). "Aeronautics Educator Guide - Activity Matrices".
  10. Personality/Activity Matrix, https://rdl.train.army.mil/catalog/view/100.ATSC/0EF89CA1-2680-4782-B103-D2F5DC941188-1274309335668/7-98-1/chap2l6.htm (archived 2014-03-08).
  11. "Homicide Investigation Tracking System (HITS)".
  12. "New Jersey State Police - Investigations Section".
  13. "Violent Crime Linkage System (ViCLAS)".
  14. Palshikar, G. K., The Hidden Truth, Intelligent Enterprise, May 2002, http://www.intelligententerprise.com//020528/509feat3_1.jhtml (archived 2008-05-15).
  15. Bolton, R. J. & Hand, D. J., Statistical Fraud Detection: A Review, Statistical Science, 2002, 17(3), pp. 235-255.
  16. Sparrow, M. K. 1991. Network Vulnerabilities and Strategic Intelligence in Law Enforcement, International Journal of Intelligence and CounterIntelligence, Vol. 5 #3.
  17. Friedrich Waismann, Verifiability (1945), p.2.
  18. Lyons, D., Open Texture and the Possibility of Legal Interpretation (2000), http://ssrn.com/abstract=212328.
  19. McGrath, C., Blythe, J., Krackhardt, D., Seeing Groups in Graph Layouts, http://www.andrew.cmu.edu/user/cm3t/groups.html (archived 2013-10-03).
  20. Picarelli, J. T., Transnational Threat Indications and Warning: The Utility of Network Analysis, Military and Intelligence Analysis Group, http://kdl.cs.umass.edu/events/aila1998/picarelli.ps (archived 2016-03-11).
  21. Schroeder et al., Automated Criminal Link Analysis Based on Domain Knowledge, Journal of the American Society for Information Science and Technology, 58:6 (842), 2007.
  22. Xu, J.J. & Chen, H., CrimeNet Explorer: A Framework for Criminal Network Knowledge Discovery, ACM Transactions on Information Systems, 23(2), April 2005, pp. 201-226.
Info: Wikipedia Source

This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.
