Starshine Intelligence System (KIS) is an all-in-one web intelligence monitoring system for law enforcement and intelligence agencies. KIS helps clients mine massive amounts of information from social media (Twitter, Facebook, YouTube, Instagram …) and traditional websites (forums, chat rooms, news …), and turns it into actionable intelligence through automatic analysis and manual processing.
Starshine Intelligence System is built on the world’s leading OSINT extraction technology, offering quick identification and full coverage. It enables users to monitor the entire web and identify important information and negative public opinion in a timely manner.
Functional description of automatic extraction sub-system
Functional description of content analysis sub-system
Functional description of interface presentation sub-system
Starshine Intelligence System is an Internet-focused information platform that collects, analyzes, summarizes and monitors massive volumes of public opinion on the web in real time. It identifies the key information and notifies the relevant people immediately so that they can respond quickly in an emergency, and it provides direct support for guiding public opinion and collecting web users’ opinions.
The business process is shown below:
Compared with current manual opinion monitoring, the system offers clear advantages:
|Indicator for comparison||Manual monitoring||Starshine Intelligence System|
|Target site||dozens||hundreds to thousands, even tens of thousands|
|Labor cost||log in to every site, manual access, manual copy and paste, tiresome||automatic access to network information; manual content viewing and analysis conducted within the private LAN|
|Negative information identification||manual review and confirmation required item by item||manual confirmation based on automatic identification|
|Information storage||fragmented, errors unavoidable||accurate, full coverage, easy to track|
|Data storage||Word files, distributed, hard to manage||all stored in a large relational database and under centralized management|
|Monitoring report||based on manual statistics and estimation, insufficient data support||based on automated statistical analysis, with both text and illustration, detailed statistical data support, daily, weekly and monthly report generation|
|Monitoring effect||partial coverage, untimely, unsatisfying, waste of human resource||full coverage, timely, minutes to tens of minutes|
Monitoring target: all information related to the relevant city or province, especially negative information
Subsequent processing: contact the management of target websites, take reactive measures, publish responsive messages without delay
1. monitor relevant information on social websites such as Twitter, Facebook and YouTube, as well as BBS, blogs, forums, news and search engines, in real time;
2. monitor the chat content of selected QQ/WeChat groups;
3. take regular screenshots to monitor selected front pages and save evidence of specific pages;
4. find all the pages that reprint a given news item;
5. automatically classify information;
6. track all the information about a certain topic or author;
7. monitoring staff can pick and then classify information;
8. monitoring staff can effortlessly export and prepare daily and weekly graph-supported reports on public opinions based on their own work.
- Eliminate or minimize the adverse effect of occasional negative information on the image of relevant province/city and on the relevant provincial/municipal government officials;
- Identify and understand public opinion concerning the relevant province/city and resolve conflicts when they just emerge;
- Identify cyber content threats and cyber security issues (e.g. leaking of confidential information).
Starshine Intelligence System is composed of three sub-systems: extraction sub-system (extraction layer), analysis sub-system (analysis layer) and presentation sub-system (presentation layer). Their connections are shown below:
The network topology of Knowlesys Online Public Opinion System is shown below. It can be separately implemented in the Internet LAN and private LAN as needed.
IV. Functional description of extraction sub-system
The automatic extraction sub-system can collect data on all targeted websites automatically.
It can extract all news articles or topic posts, or only the content of the latest topic post. It can also extract all replies to a topic post, or only the latest reply. It can monitor specified target websites, monitor websites around the world without specifying target sites, or use the two modes in combination.
Users can add target websites such as news sites and forums. Google, Twitter and Facebook are built in.
The automatic extraction sub-system can also monitor app-based chat room applications.
The all-round monitoring function of the automatic extraction sub-system is illustrated below:
The automatic extraction sub-system has the following distinct features:
- World’s leading automatic extraction function
Knowlesys’ web information extraction technology is world-leading and can accurately extract any data from any web page. Knowlesys has long provided clients with daily extraction services covering all kinds of websites, which requires an efficient and stable extraction platform.
- All targets can be monitored
Microblogs, News, BBS, blogs, public chat rooms, search engines, message boards, applications, electronic editions of newspapers and websites can be monitored in real-time.
- Thousands of news websites can be monitored without additional configuration.
With the built-in configuration for worldwide website monitoring, titles and texts can be automatically collected as long as the key words are typed in.
- Powerful multi-language processing
Content in multiple languages, such as Chinese, English, French, German, Japanese, Korean, Uyghur and Arabic, can be processed and saved automatically.
- Smart article extraction
Article texts, titles and release dates can be extracted directly from article-type web pages without additional configuration, with irrelevant content such as adverts, columns and copyright information removed automatically.
- All web page conditions are supported:
Popular Web 2.0 AJAX dynamic web sites
Auto-login with user account and password
Processing next pages automatically
Automatic extraction and combination of article contents extending several pages
Automatic downloading of images contained in texts and various attachments
Original snapshot saving option for review
Multiple Internet protocols supported: HTTP, HTTPS and FTP
Multiple web file formats supported
All the features provided by the system can be combined to handle thousands of types of web pages or data.
- Automatic deduplication function
For the same URL, only the latest, not-yet-collected article content or replies are collected each time; content that has already been collected is skipped. Automatic deduplication can also be applied to reprinted articles.
- Various built-in post-data processing functions
After data are collected from web pages, they can be further processed into finer data fields, or merged, replaced or summarized. Examples include the extraction of key words, street addresses, province/city names, postal codes, telephone numbers, fax numbers, e-mail addresses, QQ/MSN/Skype accounts and URLs.
- Automatic, unattended extraction around the clock
The system can run either on a schedule or around the clock (24/7), at intervals as short as 1 minute.
- Users can add target websites themselves.
With the collection platform provided by the system, users can visually analyze target websites, configure task parameters and add them to the system. Users can also freely modify, add and remove monitored target websites.
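Two of the features listed above, deduplication of reprinted articles and post-processing of collected text into finer fields, can be illustrated with a minimal Python sketch. It is not the product's implementation: the names `extract_fields` and `Deduplicator` are hypothetical, the regular expressions are simplified illustrations covering only e-mail addresses and URLs, and treating two texts as duplicates when their whitespace- and case-normalized SHA-256 fingerprints match is an assumption made for this example.

```python
import hashlib
import re

# Simplified, illustrative patterns (assumptions); real-world formats vary widely.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def extract_fields(text):
    """Return the e-mail addresses and URLs found in text, in order of appearance."""
    return {
        "emails": EMAIL_RE.findall(text),
        "urls": URL_RE.findall(text),
    }


class Deduplicator:
    """Remembers content fingerprints so repeated or reprinted text can be skipped."""

    def __init__(self):
        self._seen = set()

    @staticmethod
    def fingerprint(text):
        # Lowercase and collapse whitespace so trivial formatting changes
        # in a reprint do not produce a different fingerprint.
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def is_new(self, text):
        """Return True the first time a text is seen, False afterwards."""
        key = self.fingerprint(text)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

An exact fingerprint only catches reprints that are identical after normalization. Detecting lightly edited reprints would need a similarity measure, which is outside the scope of this sketch.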
V. Functional description of content analysis sub-system
The content analysis sub-system extracts the meta-information of contents and automatically classifies and clusters the contents in real time based on the key words specified by users.
The ultra-high-speed key word extraction technology developed by Starshine can find the number of occurrences of 10,000 key words in a 30,000-word article in no more than 6.9 ms.
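The text does not say which algorithm Starshine uses. One standard way to count many key words in a single pass over an article is the Aho-Corasick automaton, sketched below in Python. The class name `KeywordCounter` is hypothetical, matching is case-sensitive and substring-based, and this sketch makes no claim about reaching the timing quoted above.

```python
from collections import deque


class KeywordCounter:
    """Counts occurrences of many keywords in one pass (Aho-Corasick automaton)."""

    def __init__(self, keywords):
        # Drop empty strings and duplicates while keeping the original order.
        self.keywords = list(dict.fromkeys(k for k in keywords if k))
        self._goto = [{}]   # per-state transitions: character -> next state
        self._fail = [0]    # per-state failure link
        self._out = [[]]    # per-state indexes of keywords that end here
        for index, word in enumerate(self.keywords):
            self._insert(word, index)
        self._build_failure_links()

    def _insert(self, word, index):
        state = 0
        for ch in word:
            nxt = self._goto[state].get(ch)
            if nxt is None:
                nxt = len(self._goto)
                self._goto[state][ch] = nxt
                self._goto.append({})
                self._fail.append(0)
                self._out.append([])
            state = nxt
        self._out[state].append(index)

    def _build_failure_links(self):
        # Breadth-first traversal; depth-1 states keep the root as failure link.
        queue = deque(self._goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self._goto[state].items():
                queue.append(nxt)
                fallback = self._fail[state]
                while fallback and ch not in self._goto[fallback]:
                    fallback = self._fail[fallback]
                self._fail[nxt] = self._goto[fallback].get(ch, 0)
                # Inherit keywords that end at the failure target (proper suffixes).
                self._out[nxt] = self._out[nxt] + self._out[self._fail[nxt]]

    def count(self, text):
        """Return a dict mapping each keyword to its number of occurrences in text."""
        counts = {word: 0 for word in self.keywords}
        state = 0
        for ch in text:
            while state and ch not in self._goto[state]:
                state = self._fail[state]
            state = self._goto[state].get(ch, 0)
            for index in self._out[state]:
                counts[self.keywords[index]] += 1
        return counts
```

For instance, `KeywordCounter(["he", "she", "his", "hers"]).count("ushers")` finds one occurrence each of `he`, `she` and `hers`, and none of `his`. The scan visits each character of the text once, so adding more key words enlarges the automaton rather than adding passes over the article.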
VI. Functional description of interface presentation sub-system
Its functional architecture is illustrated below:
The analysis & viewing sub-system has the following distinct features:
1. Working in collaboration
Different users view different contents, execute different operations and perform different duties.
2. Displaying article elements, and automatically prompting suspected negative information
For news and blogs, titles, texts, authors, release time and sources can be collected.
Key words are highlighted.
3. Displaying post elements and automatically prompting suspected negative information
For posts on BBS, titles, texts, posting time, view counts, number of comments and poster IP addresses can be collected.
Key words are highlighted.
4. Classifying and compiling
The contents collected can be filtered, classified, annotated and compiled for subsequent management and analysis.
5. Powerful search function
Precise or fuzzy searches can be performed, and contents can be searched by category or by source.
6. Supporting manual collection
In emergency or exceptional situations, information collected can be entered manually.
7. Handling website restrictions
Without further configuration, the system can collect from foreign websites that are blocked in China and from websites that restrict source IP addresses or access frequency, and it can automatically collect proxy IP addresses.
8. SMS notification function
After key words are set, once one or more of them appear in the collected contents, the record can be sent to the relevant SMS recipients, enabling unattended real-time monitoring.
9. Generating public opinion reports using the public opinion analysis engine
Hot topic list, number of posts, number of comments, number of authors
sensitive topic list
automatic key word extraction
all types of trend charts
news-type report: Title, source, release time, content, clicks, commentator, comment content, number of comments, etc.
BBS-type report: Post title, poster, posting time, content, reply content, number of replies, etc.
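The counts named in the report list above (hot topics, number of posts, number of comments, number of authors) amount to a simple aggregation over collected records. The Python sketch below shows one way to compute them. The function name `summarize_posts`, the record keys `topic`, `author` and `comments`, and the choice to rank hot topics by post count are all assumptions made for illustration, not details taken from the product.

```python
from collections import Counter, defaultdict


def summarize_posts(posts, top_n=3):
    """Aggregate post records into per-topic counts for a report.

    Each post is a dict with hypothetical keys: "topic", "author", "comments".
    """
    posts_per_topic = Counter()
    comments_per_topic = Counter()
    authors_per_topic = defaultdict(set)
    for post in posts:
        topic = post["topic"]
        posts_per_topic[topic] += 1
        comments_per_topic[topic] += post["comments"]
        authors_per_topic[topic].add(post["author"])
    # Rank topics by number of posts (an assumed definition of "hot").
    return [
        {
            "topic": topic,
            "posts": count,
            "comments": comments_per_topic[topic],
            "authors": len(authors_per_topic[topic]),
        }
        for topic, count in posts_per_topic.most_common(top_n)
    ]
```

A real report would likely weight recency, views and comments as well; ranking by post count alone keeps the sketch short.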
The system is mainly used by government bodies and by the PR departments of large and medium-sized companies.
Due to the complexity of the Internet, the implementation of the Knowlesys Intelligence System requires clients’ cooperation.
We provide the following implementation services to meet users’ requirements:
|1||Turn-key project||Provide a full package of software and documentation for the Knowlesys Online Public Opinion Monitoring System; provide the acquisition configuration files for N websites specified by users; ensure real-time monitoring of the contents on target websites after the system is launched|
|2||Training||E-training or training at clients’ premises|
|3||Subsequent services||Provide configuration parameter files after target websites are updated; revisit and respond to technical inquiries; answer questions on a regular basis; provide remote O&M services to lower the skill requirements for clients’ support staff|
|4||Technical Support||Answer questions from clients via Email/QQ/MSN/Skype; give technical support|