Feature Extraction Of Protest Demonstration On Lihkg Discussion Forum

Ao Shen: Hi, I’m [inaudible] Ao Shen from the University of Hong Kong. This research is done by Dr. Dr K P Chow and me, and our title is Feature Extraction of Protest Demonstration on the Lihkg Discussion Forum.  Here is the Agenda of this presentation.

We will first overview the Lihkg data set then talk about the motivation of this research. Then it is the experiment multi-label classification. Finally, we’ll have a conclusion and talk about the future work. 

In today’s cyber space many cyber crimes leaves traces in social media platforms, such as Facebook and online discussion forum and potential criminal activities may be posted in social media. In this way the online platforms are the places for detecting potential crimes and obtaining traces. The Lihkg discussion forum is a well-known multi-category discussion forum in Hong Kong. Some activity organizers will post notification messages in advance and their supporters or regular users can immediately read, participate, and post their comments. 

From August, 2019 most of discussions on the forum refers to public gathering and demonstrations. The demonstrations in Hong Kong over these past months have astonished the world, but whether a particular activity constitutes a criminal offense depends on the local laws and regulations, but it is still important in police resource management to identify the features of an unregistered demonstration. So this study established a model to automatically identify the features of the demonstrations from public posts. 

We divided our experiment into four parts: data collection and labeling, the post vectorization, multi-label classification model and finally, it’s the experiment result. The first step is data collection. We collect the post on the Current Affairs section from 2019 August to October. Most of the posts are in traditional Chinese or Cantonese and the centers in this bracket is the English translation for this presentation. 


Get The Latest DFIR News

Join the Forensic Focus newsletter for the best DFIR articles in your inbox every month.


Unsubscribe any time. We respect your privacy - read our privacy policy.

About data labeling, totally there are seven labels: strikes, arson, wounding, sit-ins, parades riots and conflict, and we’ll randomly select 1,272 posts to label the data like this. Like the table shown here. 

The second part is the post vectorization. We first use a Doc2Vector master to transfer the textual post to vector metrics, and then use the Pearson’s correlation coefficient to calculate the coefficient metrics to find the relationship between the post and the seven labels. And then this matrix is the input of the multi-label classification model. For the classification model we use MLP neural network to build the model. It is a class of the feedforward artificial neural network and I show the structure and the information of the model here. The accuracy precision recall, and F1 score as shown in this slide. And from the weighted average value here, we can see that the MLP has a good performance and reliable classifying to the corresponding labels.

I think the label data is unbalanced. It’s still difficult to identify some minority features. We can also output some high correlation topics appear in each class. So based on these topics, it is feasible to describe the subject and understand the urgency of the corresponding demonstration.

To make a conclusion, the Floyd demo demonstration will not only disrupt the business and traffic, but also affect social security and threaten the social order. We use MLP multi-label classification methods to understand the specific forms and the characteristic of the harmful demonstration on online forums so that it can provide the theoretical support and applications and posts to assign resources and have a proper monitoring of the activity schedule. The future research will attempt to study the relationship between the subjects and discourses on a deeper level and that’s the end of this presentation. And thank you for listening.

Leave a Comment

Latest Videos

In this episode of the Forensic Focus podcast, Desi and Si discuss different online programming courses and what they think about the popular platform, Udemy. They also talk about Flipper, Dev boards, and Raspberry Pi, and delve into the fascinating phenomenon of running the classic game Doom on unlikely devices.

Throughout the episode, Desi and Si share their digital forensics expertise, referencing some of the cases they have been working on and highlighting particular methodologies and technologies that have an impact on cybersecurity.

Show Notes:

100 Days of Code: The Complete Python Pro Bootcamp for 2023 - https://www.udemy.com/course/100-days-of-code/

Domestika - https://www.domestika.org/en

MIT OpenCourseWare - https://www.youtube.com/@mitocw 

MasterClass - https://www.masterclass.com/

Raspberry Pi 400 Complete Kit - https://core-electronics.com.au/raspberry-pi-400-kit.html

Flipper Discord - https://discord.com/invite/flipper

Flipper Zero - https://flipperzero.one/

This Programmer Figured Out How to Play Doom on a Pregnancy Test - https://www.popularmechanics.com/science/a33957256/this-programmer-figured-out-how-to-play-doom-on-a-pregnancy-test/

Here’s a dude playing Doom Eternal on his fridge - https://www.polygon.com/2020/10/13/21514933/doom-eternal-refrigerator-door-samsung-smart-refrigerator-xbox-game-pass-richard-mallard

Doom hacker gets Doom running in Doom - https://www.pcgamer.com/doom-hacker-gets-doom-running-in-doom/

Doom Running On A Calculator Powered By Old Potatoes - https://kotaku.com/doom-running-on-a-calculator-powered-by-old-potatoes-1845374069

GoldenEra - https://www.imdb.com/title/tt11753760/

Racing the Beam - https://en.wikipedia.org/wiki/Racing_the_Beam

High Score (TV series) - https://en.wikipedia.org/wiki/High_Score_(TV_series)

Microcontroller Courses (Udemy) - https://www.udemy.com/topic/microcontroller/

The story of Final Fantasy XIV’s renegade do-good modders - https://www.pcgamesn.com/final-fantasy-xiv/ffxiv-modders-renegade-do-gooders

Logical fallacies - https://yourlogicalfallacyis.com/

In this episode of the Forensic Focus podcast, Desi and Si discuss different online programming courses and what they think about the popular platform, Udemy. They also talk about Flipper, Dev boards, and Raspberry Pi, and delve into the fascinating phenomenon of running the classic game Doom on unlikely devices.

Throughout the episode, Desi and Si share their digital forensics expertise, referencing some of the cases they have been working on and highlighting particular methodologies and technologies that have an impact on cybersecurity.

Show Notes:

100 Days of Code: The Complete Python Pro Bootcamp for 2023 - https://www.udemy.com/course/100-days-of-code/

Domestika - https://www.domestika.org/en

MIT OpenCourseWare - https://www.youtube.com/@mitocw

MasterClass - https://www.masterclass.com/

Raspberry Pi 400 Complete Kit - https://core-electronics.com.au/raspberry-pi-400-kit.html

Flipper Discord - https://discord.com/invite/flipper

Flipper Zero - https://flipperzero.one/

This Programmer Figured Out How to Play Doom on a Pregnancy Test - https://www.popularmechanics.com/science/a33957256/this-programmer-figured-out-how-to-play-doom-on-a-pregnancy-test/

Here’s a dude playing Doom Eternal on his fridge - https://www.polygon.com/2020/10/13/21514933/doom-eternal-refrigerator-door-samsung-smart-refrigerator-xbox-game-pass-richard-mallard

Doom hacker gets Doom running in Doom - https://www.pcgamer.com/doom-hacker-gets-doom-running-in-doom/

Doom Running On A Calculator Powered By Old Potatoes - https://kotaku.com/doom-running-on-a-calculator-powered-by-old-potatoes-1845374069

GoldenEra - https://www.imdb.com/title/tt11753760/

Racing the Beam - https://en.wikipedia.org/wiki/Racing_the_Beam

High Score (TV series) - https://en.wikipedia.org/wiki/High_Score_(TV_series)

Microcontroller Courses (Udemy) - https://www.udemy.com/topic/microcontroller/

The story of Final Fantasy XIV’s renegade do-good modders - https://www.pcgamesn.com/final-fantasy-xiv/ffxiv-modders-renegade-do-gooders

Logical fallacies - https://yourlogicalfallacyis.com/

YouTube Video UCQajlJPesqmyWJDN52AZI4Q_5f72B6DD5wk

Programming Languages, Flipper And Gaming

Forensic Focus 66 views 24th May 2023 11:43 am

In this episode of the Forensic Focus podcast, Si and Desi talk to Mackenzie Jackson, Developer Advocate at Git Guardian. 

Mackenzie discusses the problem of hard-coded and leaked credentials in Git repositories, the task of scanning Git repositories for leaked credentials, and how that’s helped by the setup of GitHub and Git. 

He also looks at some public and private cases of security breaches through Git repositories and recommends tools you can use to combat attackers on Git. 

Show Notes:

Toyota Suffered a Data Breach by Accidentally Exposing A Secret Key Publicly On GitHub (GitGuardian) - https://blog.gitguardian.com/toyota-accidently-exposed-a-secret-key-publicly-on-github-for-five-years/

GitHub.com rotates its exposed private SSH key (Bleeping Computer) - https://www.bleepingcomputer.com/news/security/githubcom-rotates-its-exposed-private-ssh-key/

Conpago - https://www.conpago.com.au/

Source Code as a Vulnerability - A Deep Dive into the Real Security Threats From the Twitch Leak (GitGuardian) - https://blog.gitguardian.com/security-threats-from-the-twitch-leak/

Teenagers Leveraging Insider Threats: Lapsus$ Hacker Group (Forbes) - https://www.forbes.com/sites/emilsayegh/2023/03/15/teenagers-leveraging-insider-threats-lapsus-hacker-group

Lapsus$: Oxford teen accused of being multi-millionaire cyber-criminal (BBC) - https://www.bbc.co.uk/news/technology-60864283

Dynamic Secrets (HashiCorp) - https://developer.hashicorp.com/vault/tutorials/getting-started/getting-started-dynamic-secrets

Crappy code, crappy Copilot. GitHub Copilot is writing vulnerable code and it could be your fault (GitGuardian) - https://blog.gitguardian.com/crappy-code-crappy-copilot/

trufflesecurity/trufflehog (GitHub) - https://github.com/trufflesecurity/trufflehog

gitleaks/gitleaks (GitHub) - https://github.com/gitleaks/gitleaks

Git (Wikipedia) - https://en.wikipedia.org/wiki/Git

awslabs/git-secrets (GitHub) - https://github.com/awslabs/git-secrets

In this episode of the Forensic Focus podcast, Si and Desi talk to Mackenzie Jackson, Developer Advocate at Git Guardian.

Mackenzie discusses the problem of hard-coded and leaked credentials in Git repositories, the task of scanning Git repositories for leaked credentials, and how that’s helped by the setup of GitHub and Git.

He also looks at some public and private cases of security breaches through Git repositories and recommends tools you can use to combat attackers on Git.

Show Notes:

Toyota Suffered a Data Breach by Accidentally Exposing A Secret Key Publicly On GitHub (GitGuardian) - https://blog.gitguardian.com/toyota-accidently-exposed-a-secret-key-publicly-on-github-for-five-years/

GitHub.com rotates its exposed private SSH key (Bleeping Computer) - https://www.bleepingcomputer.com/news/security/githubcom-rotates-its-exposed-private-ssh-key/

Conpago - https://www.conpago.com.au/

Source Code as a Vulnerability - A Deep Dive into the Real Security Threats From the Twitch Leak (GitGuardian) - https://blog.gitguardian.com/security-threats-from-the-twitch-leak/

Teenagers Leveraging Insider Threats: Lapsus$ Hacker Group (Forbes) - https://www.forbes.com/sites/emilsayegh/2023/03/15/teenagers-leveraging-insider-threats-lapsus-hacker-group

Lapsus$: Oxford teen accused of being multi-millionaire cyber-criminal (BBC) - https://www.bbc.co.uk/news/technology-60864283

Dynamic Secrets (HashiCorp) - https://developer.hashicorp.com/vault/tutorials/getting-started/getting-started-dynamic-secrets

Crappy code, crappy Copilot. GitHub Copilot is writing vulnerable code and it could be your fault (GitGuardian) - https://blog.gitguardian.com/crappy-code-crappy-copilot/

trufflesecurity/trufflehog (GitHub) - https://github.com/trufflesecurity/trufflehog

gitleaks/gitleaks (GitHub) - https://github.com/gitleaks/gitleaks

Git (Wikipedia) - https://en.wikipedia.org/wiki/Git

awslabs/git-secrets (GitHub) - https://github.com/awslabs/git-secrets

YouTube Video UCQajlJPesqmyWJDN52AZI4Q_BX15Z_xF8mA

Preventing Data Leaks With Git Guardian

Forensic Focus 72 views 3rd May 2023 11:07 am

Latest Articles

Share to...