Transformations for linguistic steganography


Type

Thesis

Authors

Chang, Ching-Yun 

Abstract

Linguistic steganography is a form of covert communication that uses natural language to conceal the existence of a hidden message. It is usually achieved by systematically making changes to a cover text such that the manipulations, and thus the very act of communication, are undetectable to an outside observer (human or computer). In this thesis, we explore three linguistic transformations (lexical substitution, adjective deletion and word ordering) that can generate alternatives for a cover text. For each transformation, we propose transformation checkers that certify the naturalness of a modified sentence.

Our lexical substitution checkers are based on contextual n-gram counts and the α-skew divergence of those counts, derived from the Google n-gram corpus. For adjective deletion, we propose an n-gram count method similar to the substitution n-gram checker, and a support vector machine classifier that uses n-gram counts and other measures to classify adjectives in context as deletable or undeletable. As for word ordering, we train a maximum entropy classifier using syntactic features to determine the naturalness of a sentence permutation.
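
As an illustration of the α-skew divergence mentioned above, the following minimal sketch uses the standard definition, the KL divergence of r from the mixture αq + (1 − α)r. The function names, the value α = 0.99 and the n-gram counts are assumptions made for this example; the abstract does not state how the thesis arranges its counts or chooses α.

```python
import math


def normalise(counts):
    """Turn a list of raw counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def alpha_skew_divergence(q, r, alpha=0.99):
    """Return KL(r || alpha*q + (1-alpha)*r) for two aligned distributions.

    Mixing a little of r into q keeps the result finite even when q
    assigns zero probability to an event that r has seen.
    """
    total = 0.0
    for q_i, r_i in zip(q, r):
        if r_i > 0:
            mix = alpha * q_i + (1 - alpha) * r_i
            total += r_i * math.log(r_i / mix)
    return total


# Hypothetical counts of the same four contextual n-grams, once with the
# original word and once with a candidate substitute (invented numbers).
original_counts = [120, 45, 30, 5]
substitute_counts = [100, 0, 60, 10]

divergence = alpha_skew_divergence(
    normalise(substitute_counts), normalise(original_counts)
)
```

A lower value indicates that the substitute's n-gram profile is closer to that of the original word. How such a score would be thresholded is not specified in the abstract.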

The proposed transformation checkers were evaluated against human-judged data, and the evaluation results are presented as precision and recall curves. The precision and recall of a transformation checker can be interpreted as the security level and the embedding capacity of the stegosystem, respectively. The results show that the proposed transformation checkers can provide a confident security level and reasonable embedding capacity for the steganography application.
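
The precision and recall interpretation above can be made concrete with a small sketch. It assumes that a checker outputs a score per candidate transformation, that candidates scoring at or above a threshold are accepted, and that human judges have labelled each candidate as natural or not. The function name, scores and labels are invented for illustration.

```python
def precision_recall(scores, judged_natural, threshold):
    """Compute (precision, recall) of a checker at one threshold.

    precision: fraction of accepted transformations that humans judged natural
               (read as the security level).
    recall:    fraction of human-approved transformations that were accepted
               (read as the embedding capacity).
    """
    accepted = [s >= threshold for s in scores]
    true_pos = sum(1 for a, j in zip(accepted, judged_natural) if a and j)
    n_accepted = sum(accepted)
    n_natural = sum(judged_natural)
    # Convention assumed here: precision is 1.0 when nothing is accepted.
    precision = true_pos / n_accepted if n_accepted else 1.0
    recall = true_pos / n_natural if n_natural else 0.0
    return precision, recall


# Hypothetical checker scores and human judgements for four candidates.
scores = [0.9, 0.8, 0.4, 0.2]
judged_natural = [True, True, False, True]

# Sweeping the threshold yields the points of a precision-recall curve.
curve = [precision_recall(scores, judged_natural, t) for t in (0.5, 0.1)]
```

Raising the threshold accepts fewer transformations, which tends to raise precision (security) at the cost of recall (capacity).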

In addition to the transformation checkers, we demonstrate possible data encoding methods for each of the linguistic transformations. For lexical substitution, we propose a novel encoding method based on vertex colouring. For adjective deletion, we not only illustrate its usage in the steganography application, but also show that the adjective deletion technique can be applied to a secret sharing scheme, where the secret message is encoded in two different versions of the carrier text, with different adjectives deleted in each version. For word ordering, we propose a ranking-based encoding method and also show how the technique can be integrated into existing translation-based embedding methods.
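
To give a sense of how vertex colouring can assign codes to interchangeable words, here is a simplified sketch. It is not the encoding method of the thesis, whose details the abstract does not give. It assumes a hypothetical synonym graph in which an edge joins two words judged interchangeable, uses plain greedy colouring, and treats a word's colour as the value it carries. All names and data are illustrative.

```python
def greedy_colouring(graph):
    """Give each vertex the smallest colour unused by its coloured neighbours.

    Vertices are visited in sorted order so that sender and receiver,
    who share the graph, derive the same colouring.
    """
    colours = {}
    for vertex in sorted(graph):
        used = {colours[n] for n in graph[vertex] if n in colours}
        colour = 0
        while colour in used:
            colour += 1
        colours[vertex] = colour
    return colours


def embed(word, value, graph, colours):
    """Return the word or a synonym whose colour equals value, else None."""
    for candidate in sorted({word} | graph[word]):
        if colours[candidate] == value:
            return candidate
    return None


def extract(word, colours):
    """Recover the embedded value from the word seen in the stego text."""
    return colours[word]


# Hypothetical synonym graph (adjacency sets); every pair is interchangeable.
graph = {
    "big": {"large", "huge"},
    "large": {"big", "huge"},
    "huge": {"big", "large"},
}
colours = greedy_colouring(graph)
stego_word = embed("big", 2, graph, colours)
```

Because adjacent words never share a colour, each word in a set of mutual synonyms carries a distinct value, so the receiver can decode by looking up the colour of the word that appears in the text.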

Date

2013-07-01

Advisors

Clark, Stephen

Keywords

Linguistic steganography

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge