How Does Data Science Impact Cybersecurity?

By Ishwarya Lakshmi.S.S. on ALTCOIN MAGAZINE

Published Oct 18, 2019 (9 min read)

Data science was once just a subset of computer science, with mathematicians and statisticians running calculations to test null and alternative hypotheses. Later, with the Information Technology revolution and the onset of the Internet, we began to have ample sources of data that could neither be ignored completely nor used efficiently. This led to the arrival of big data concepts and to data science as a discipline of computer science in its own right, one that can be mastered with sheer interest.

So what is the biggest asset of the 21st century?

DATA Is Undoubtedly The Biggest Asset Of The Century.

Data can do wonders if handled properly. It can reveal trends and patterns that could never be found otherwise. Yes, you read that right: data can do magic and reveal hidden insights.

“Data plays a pivotal role in every domain, especially in healthcare, where data becomes a lifesaver.”

Data has paved the way for machine learning algorithms that make our lives easier. Machine learning, deep learning, and Artificial Intelligence have become part and parcel of our lives to an unimaginable extent.

So what comes as a threat to the asset?

Every asset needs to be safeguarded. Because data is the biggest asset of the century, it should be handled with great care, given the major threat of it being stolen or mishandled. In the IT era, it is really important to protect data from malicious intruders. In the hands of intruders with malicious intentions, stolen data could even be life-threatening. Intruders can even use stolen data to forge identities for illegal entry, and so on. So it is really important to detect and prevent intrusions, which can be done using INTRUSION DETECTION SYSTEMS.

Relationship Between Data Science And Cyber Security

Every organization has valuable and confidential data that needs to be protected at all costs. This might include network architecture, network transaction logs, the frequency of normal network traffic, and other related details. Such historical and current data is vital for building a machine learning model, and it helps the model perform better. Organizations can collect information on network-related intrusions, such as the difference in frequency between normal and abnormal network traffic. The time at which each event occurs can also be logged and later fed to the models as a parameter to detect outliers (activity that differs from normal) and thereby detect intrusions in the network. Thus data science enables organizations to develop tangible, data-driven ecosystems that keep data more secure.

Blockchain: The Missing Link

What is blockchain all about?

Blockchain, as the name suggests, is a chain of blocks. Here the words ‘block’ and ‘chain’ are not meant literally. In this context, a ‘block’ is a piece of digital information, and the ‘chain’ is the public database in which the blocks are stored. A block stores information about transactions, such as the time, date, and cost of a purchase. It also records who took part in the transaction, identified not by a real name but by a unique ‘digital signature’ that works much like a username.

“Each block stores a unique code called a ‘hash’, which enables us to tell one block apart from another.”

The major application of blockchain is cryptocurrencies such as Bitcoin.


How does a block get added to the chain?

For a block to be added to the chain, four things must happen:

1) A transaction must occur.

2) The transaction must be verified.

3) The information about the transaction must be stored in a block.

4) The block must be given a unique hash. The block also stores the hash of the block most recently added to the blockchain.

Once hashed, the block can be added to the blockchain, where it becomes publicly available for anyone to look at. Anyone can see when and where a block was added to the blockchain, and by whom.
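
The four steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real blockchain is implemented: it assumes the transaction has already been verified, leaves out consensus and mining entirely, and the names `hash_block` and `add_block` and the sample transactions are made up for this example.

```python
import hashlib
import json
import time


def hash_block(block: dict) -> str:
    """Return the SHA-256 hex digest of a block's contents."""
    # sort_keys gives a stable serialization, so equal contents hash equally.
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_block(chain: list, transaction: dict) -> dict:
    """Store an (assumed verified) transaction in a new block and append it."""
    previous_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {
        "height": len(chain),            # position of the block in the chain
        "timestamp": time.time(),        # when the block was created
        "transaction": transaction,
        "previous_hash": previous_hash,  # hash of the most recently added block
    }
    # The hash is computed over the fields above, then stored on the block.
    block["hash"] = hash_block(block)
    chain.append(block)
    return block


chain = []
add_block(chain, {"buyer": "alice", "amount": 20})  # hypothetical transactions
add_block(chain, {"buyer": "bob", "amount": 35})
```

Each new block carries the hash of the block before it, and that link is what ties the blocks into a chain.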

How does blockchain become more secure?

First of all, blocks are added in chronological, linear order. New blocks are always added to the end, and the position at which a block sits in the chain is referred to as its ‘height’. Once a block has been added to the chain, it is difficult to alter its contents, because changing a block changes its hash, and each later block stores the hash of the block before it. Talking about security, suppose a hacker wants to manipulate a block so that a user is forced to pay twice the cost of a purchase. The hacker would have to edit the block that contains the details of that particular transaction, and would then face the tedious job of recomputing the hash values of all the blocks after it, which would consume enormous computing time and power.

In practice, it is close to impossible to alter or delete blocks once they have been added to the chain. This makes blockchain technology more secure.
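
To see why editing one block is so costly, here is a small Python sketch of a hash-linked chain and a check that finds the first block that no longer validates. It is a toy model under simplifying assumptions (no mining, no network of nodes), and `block_hash`, `build_chain`, and `first_invalid_block` are names invented for this illustration.

```python
import hashlib
import json


def block_hash(height, transaction, previous_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"height": height, "transaction": transaction, "previous_hash": previous_hash},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(transactions):
    """Build a toy chain in which every block stores its predecessor's hash."""
    chain, previous_hash = [], "0" * 64
    for height, tx in enumerate(transactions):
        current = block_hash(height, tx, previous_hash)
        chain.append({"height": height, "transaction": tx,
                      "previous_hash": previous_hash, "hash": current})
        previous_hash = current
    return chain


def first_invalid_block(chain):
    """Return the height of the first block that fails validation, or None."""
    previous_hash = "0" * 64
    for block in chain:
        recomputed = block_hash(block["height"], block["transaction"],
                                block["previous_hash"])
        if block["previous_hash"] != previous_hash or block["hash"] != recomputed:
            return block["height"]
        previous_hash = block["hash"]
    return None


chain = build_chain([{"buyer": "alice", "amount": 20},
                     {"buyer": "bob", "amount": 35},
                     {"buyer": "carol", "amount": 10}])
untouched = first_invalid_block(chain)     # None: the chain is consistent

# A hacker doubles the amount in block 1 without redoing any hashes.
chain[1]["transaction"]["amount"] = 70
tampered = first_invalid_block(chain)      # 1: block 1 no longer matches its hash

# Recomputing block 1's own hash only moves the problem to block 2,
# which still stores the old hash of block 1.
chain[1]["hash"] = block_hash(1, chain[1]["transaction"], chain[1]["previous_hash"])
after_rehash = first_invalid_block(chain)  # 2
```

In this toy model the hacker would have to rehash every later block as well. On a real network that cost is much higher, because the other participants hold their own copies of the chain.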

The unexplored relationship between blockchain and data science:

Though both technologies treat data as their asset, there is a vast difference between them: blockchain is used to record and validate data, whereas data science is used to analyze it.

The First Challenges For Data Science Are Inaccessible Data, Privacy Issues, And Dirty Data (Duplicate And Incorrect Data).

How Can Blockchain Help Data Science?

Blockchain is helpful for validating data and ensuring its quality, whereas data science is all about drawing trends and predictions from humongous amounts of data; in other words, one is about quality and the other about quantity. Blockchain changed how data is stored by storing it in a decentralized manner, which was a technical game changer. It also integrates with other advanced technologies such as cloud solutions and Artificial Intelligence.

Blockchain And Data Science Go Hand In Hand In The Following Ways:

1) Ensuring data integrity: The data in a blockchain is verified as it is processed, which makes it more reliable than most other sources of data. Its transparency also makes it possible to trace the origin of the data and the path it has taken.

2) Protection from malicious attacks: Since the data is verified using many algorithms, it is close to impossible for a malicious attack on any of the blocks to succeed.

3) Help with predictive analytics: Blockchain provides structured data in a distributed form, which can be used to gain useful insights with ease.

4) Real-time analysis: It offers real-time cross-border transactions, irrespective of geographic limitations.

5) Optimized data sharing: If data that one team has already analyzed and cleansed is stored in a blockchain, another team can see that there is no need to cleanse it again, which saves time and effort.

Every technology is intertwined with the others, but our major focus here is data security and data science, which partially includes blockchain technology.

Branches Of Cyber Security That Can Use Data Science:

How does data science help in crime prediction?

Crimes like murders, robberies, and shootings have always existed in the world. Fighting crime is one of the major concerns of any government. Police stations hold historical data on crimes and criminals in files, and once it is entered into a database, that data becomes one of the richest sources of information for foreseeing trends and patterns.

For example, if there is a rise in the number of robberies in a particular area, the police can analyze what is going wrong there and deploy more patrols in that area.
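
As a rough sketch of that idea, the Python snippet below counts robberies per area in two consecutive months and flags the areas where the count has risen. The records, the area names, and the `min_increase` threshold are all invented for illustration; a real system would work on far richer data.

```python
from collections import Counter

# Hypothetical incident records: (month, area, crime type).
incidents = [
    ("2019-08", "North", "robbery"),
    ("2019-08", "South", "robbery"),
    ("2019-09", "North", "robbery"),
    ("2019-09", "North", "robbery"),
    ("2019-09", "North", "robbery"),
    ("2019-09", "South", "robbery"),
]


def robberies_by_area(records, month):
    """Count robberies per area for the given month."""
    return Counter(area for m, area, crime in records
                   if m == month and crime == "robbery")


def rising_areas(records, previous_month, current_month, min_increase=2):
    """Return areas where robberies rose by at least `min_increase`."""
    before = robberies_by_area(records, previous_month)
    now = robberies_by_area(records, current_month)
    # A Counter returns 0 for areas with no incidents in the earlier month.
    return sorted(area for area in now if now[area] - before[area] >= min_increase)


hotspots = rising_areas(incidents, "2019-08", "2019-09")  # ["North"]
```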

The Indian police force has started to take an interest in crime prediction using big data. They store and analyze a humongous volume of real-time data about people and their behavior.

Data about criminals who are out on parole can also yield useful insights through analytics. The police force uses predictive analytics to find the areas that are more prone to crime.

Interestingly, the Delhi police have partnered with ISRO to develop an analytical system called the Crime Mapping, Analytics and Predictive System (CMAPS), which helps them ensure internal security, control crime, and maintain law and order through the analysis of data and patterns. Similarly, the Jharkhand police are trying to develop a crime analytics system in coordination with IIM Ranchi to predict crime-prone zones using machine learning models.

Beyond fighting crime, the police can also use analytics for better crowd management during festivals and to reduce road accidents.

Cybercrimes: Masters At Generating Data

On the other side of the page, there is another variety of crime that takes place entirely online, and the crime itself leaves behind lots of data to be investigated. Yes, you read that right: cybercrimes are data warehouses in themselves, which can be used for predictive analytics.

The cybersecurity-related information made available by big data techniques drastically reduces the time taken to detect and resolve an attack, and it enables cyber analysts to predict intrusions.

According to a CSO Online report, 90% of respondents to MeriTalk’s U.S. Government Survey said they had seen a decline in cyberattacks after implementing big data concepts, and 84% said they had blocked intrusions with the help of predictive analytics and big data.

Big data also helps analysts visualize cyberattacks by taking the complexity of various data sources and simplifying the patterns into visualizations.

Historical data plays a crucial role in framing a statistical baseline of what normal behavior looks like, which can then be used to identify when behavior deviates from normal. It also opens up new possibilities for predictive and statistical models, which in turn give the ability to predict future events.
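
One simple way to picture such a baseline is a mean and a standard deviation computed from historical traffic, with anything too far from the mean flagged as abnormal. The sketch below assumes hypothetical requests-per-minute counts and a threshold of three standard deviations. Both are illustrative choices, not values taken from any real system.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize normal behavior as (mean, standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # With no variation in the history, any different value is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical requests-per-minute counts observed during normal operation.
normal_traffic = [120, 132, 118, 125, 130, 127, 121, 135, 129, 124]
baseline = build_baseline(normal_traffic)

# New observations: only the spike to 480 deviates from the baseline.
new_observations = [126, 131, 480, 119]
flags = [is_anomalous(v, baseline) for v in new_observations]
```

Real traffic also has daily and weekly cycles, so a production baseline would usually be built per time window rather than from one flat list.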

Case Study: How Data Science Can Be Used To Detect DDoS Attacks Against A Data Center

STEP 1: Understanding What A DDoS Attack Is

DDoS stands for Distributed Denial of Service. A DDoS attack is a malicious attempt to disrupt the normal traffic of a targeted server or network by overwhelming the target or its surrounding infrastructure with a flood of Internet traffic.

Figure: Understanding a DDoS attack

STEP 2: Solution To Tackle A DDoS Attack

1) Analyze the correlation of the information flows into the data center.

2) Figure out an effective approach to detect anomalies using machine learning algorithms.

3) The best fit for this case is the KNN (K-Nearest Neighbour) classification algorithm.

4) It uses K-nearest neighbor traffic classification with correlation analysis to detect DDoS attacks.
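
To make the classification step concrete, here is a bare-bones K-nearest neighbor classifier in plain Python. It only illustrates the voting idea: the correlation analysis from the steps above is left out, and the two features (packet rate and number of distinct source addresses, both scaled to the 0 to 1 range), the labels, and the training points are all assumptions made up for this example.

```python
import math
from collections import Counter


def knn_classify(sample, training_data, k=3):
    """Label `sample` by majority vote among its k nearest training points."""
    # Sort the labelled points by Euclidean distance to the sample.
    by_distance = sorted(training_data,
                         key=lambda item: math.dist(sample, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, pre-scaled features per time window:
# (packet rate, number of distinct source addresses).
training_data = [
    ((0.10, 0.15), "normal"),
    ((0.20, 0.10), "normal"),
    ((0.15, 0.25), "normal"),
    ((0.25, 0.20), "normal"),
    ((0.85, 0.90), "attack"),
    ((0.90, 0.80), "attack"),
    ((0.95, 0.95), "attack"),
    ((0.80, 0.85), "attack"),
]

quiet_window = knn_classify((0.18, 0.17), training_data)  # "normal"
flood_window = knn_classify((0.88, 0.92), training_data)  # "attack"
```

Because KNN compares every new sample with the stored training points, a dense training set makes each prediction slower, so reducing that overhead matters.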

STEP 3: Outcome Of Applying Data Science

This approach exploits the correlation information in the training data to improve classification accuracy and to reduce the overhead caused by the density of the training data.

So What Comes As The Outcome?

Data is very similar to water: it can be shaped to yield the required information and patterns, but handling it inappropriately could be disastrous. Data is just information unless steps are taken to use it to improve cybersecurity. The key to a big data security solution is therefore an architecture that can automatically respond to the threats noticed in data and logs, and in which there is a high level of trust in the accuracy of the data. Contrary to the belief of many, big data will not quickly solve all the problems of the cybersecurity industry. What it does is pave the way to detect anomalies and to discover hidden, advanced attack vectors.
