decompiler  1.0.0
ghidra::StringManagerUnicode Class Reference

An implementation of StringManager that understands terminated unicode strings.

#include <stringmanage.hh>

Inheritance diagram for ghidra::StringManagerUnicode:
ghidra::StringManager

Public Member Functions

 StringManagerUnicode (Architecture *g, int4 max)
 Constructor.
 
virtual const vector< uint1 > & getStringData (const Address &addr, Datatype *charType, bool &isTrunc)
 Retrieve string data at the given address as a UTF8 byte array.
 
bool writeUnicode (ostream &s, uint1 *buffer, int4 size, int4 charsize)
 Translate/copy unicode to UTF8.
 
- Public Member Functions inherited from ghidra::StringManager
 StringManager (int4 max)
 Constructor.
 
virtual ~StringManager (void)
 Destructor.
 
void clear (void)
 Clear out any cached strings.
 
bool isString (const Address &addr, Datatype *charType)
 
void encode (Encoder &encoder) const
 Encode cached strings to a stream.
 
void decode (Decoder &decoder)
 Restore string cache from a stream.
 

Private Member Functions

int4 checkCharacters (const uint1 *buf, int4 size, int4 charsize) const
 Make sure buffer has valid bounded set of unicode.
 

Private Attributes

Architecture * glb
 Underlying architecture.
 
uint1 * testBuffer
 Temporary buffer for pulling in loadimage bytes.
 

Additional Inherited Members

- Static Public Member Functions inherited from ghidra::StringManager
static bool hasCharTerminator (const uint1 *buffer, int4 size, int4 charsize)
 Check for a unicode string terminator.
 
static int4 readUtf16 (const uint1 *buf, bool bigend)
 Read a UTF16 code point from a byte array.
 
static void writeUtf8 (ostream &s, int4 codepoint)
 Write unicode character to stream in UTF8 encoding.
 
static int4 getCodepoint (const uint1 *buf, int4 charsize, bool bigend, int4 &skip)
 Extract next unicode codepoint.
 
- Protected Attributes inherited from ghidra::StringManager
map< Address, StringData > stringMap
 Map from address to string data.
 
int4 maximumChars
 Maximum characters in a string before truncating.
 

Detailed Description

An implementation of StringManager that understands terminated unicode strings.

This class understands UTF8, UTF16, and UTF32 encodings. It reports a string if it sees a valid encoding that is null terminated.
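
The terminator condition can be illustrated with a minimal sketch. It is not the Ghidra implementation: the name hasTerminatorSketch is hypothetical, and the sketch assumes that a terminator is one charsize-aligned unit made up entirely of zero bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the Ghidra API).
// Assumption: a terminator is a charsize-aligned unit whose bytes are all zero.
bool hasTerminatorSketch(const std::uint8_t *buffer, int size, int charsize)
{
  assert(charsize == 1 || charsize == 2 || charsize == 4);
  for (int i = 0; i + charsize <= size; i += charsize) {
    bool isTerminator = true;
    for (int j = 0; j < charsize; ++j) {
      if (buffer[i + j] != 0) {  // any non-zero byte rules this unit out
        isTerminator = false;
        break;
      }
    }
    if (isTerminator) return true;
  }
  return false;  // no complete all-zero unit was found
}
```

Under this assumption, a zero byte that belongs to a wider character (such as the high byte of an ASCII character stored as little-endian UTF16) does not count as a terminator on its own.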

Constructor & Destructor Documentation

◆ StringManagerUnicode()

ghidra::StringManagerUnicode::StringManagerUnicode ( Architecture *  g,
int4  max 
)

Constructor.

Parameters
g  is the underlying architecture (and loadimage)
max  is the maximum number of bytes to allow in a decoded string

References glb, and testBuffer.

Member Function Documentation

◆ checkCharacters()

int4 ghidra::StringManagerUnicode::checkCharacters ( const uint1 *  buf,
int4  size,
int4  charsize 
) const
private

Make sure buffer has valid bounded set of unicode.

Check that the given buffer contains valid unicode. If the string is encoded in UTF8 or ASCII, each character provides (on average) about one bit of validity checking. For UTF16, the reserved surrogate range provides at least some checking.

Parameters
buf  is the byte array to check
size  is the size of the buffer in bytes
charsize  is the UTF encoding (1=UTF8, 2=UTF16, 4=UTF32)
Returns
the number of characters or -1 if there is an invalid encoding

References ghidra::StringManager::getCodepoint(), glb, ghidra::Translate::isBigEndian(), and ghidra::Architecture::translate.

Referenced by getStringData().
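
As a rough illustration of the UTF16 case, the following sketch counts code points up to the first terminator and rejects unpaired surrogates. It is an assumption-laden stand-in, not the actual checkCharacters() body: the names readUnit16 and countUtf16Sketch are hypothetical, the byte order is passed in directly rather than taken from the architecture, and a trailing odd byte is simply ignored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: read one 16-bit unit with the requested byte order.
static int readUnit16(const std::uint8_t *p, bool bigend)
{
  return bigend ? ((p[0] << 8) | p[1]) : ((p[1] << 8) | p[0]);
}

// Hypothetical sketch (not the Ghidra implementation).
// Counts UTF16 code points up to the first zero unit or the end of the buffer.
// Returns -1 if a surrogate is unpaired or a pair is cut off by the buffer end.
int countUtf16Sketch(const std::uint8_t *buf, int size, bool bigend)
{
  assert(buf != nullptr);
  int count = 0;
  int i = 0;
  while (i + 2 <= size) {
    int unit = readUnit16(buf + i, bigend);
    if (unit == 0) break;                       // terminator reached
    i += 2;
    if (unit >= 0xD800 && unit <= 0xDBFF) {     // high surrogate: needs a low one
      if (i + 2 > size) return -1;
      int low = readUnit16(buf + i, bigend);
      if (low < 0xDC00 || low > 0xDFFF) return -1;
      i += 2;
    } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
      return -1;                                // low surrogate with no high one
    }
    count += 1;
  }
  return count;
}
```

The surrogate range is the only structural constraint UTF16 offers, which is why this kind of check catches fewer non-string buffers than the equivalent UTF8 check.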

◆ getStringData()

const vector< uint1 > & ghidra::StringManagerUnicode::getStringData ( const Address &  addr,
Datatype *  charType,
bool &  isTrunc 
)
virtual

Retrieve string data at the given address as a UTF8 byte array.

If the address does not represent string data, a zero length vector is returned. Otherwise, the string data is fetched, converted to a UTF8 encoding, cached and returned.

Parameters
addr  is the given address
charType  is a character data-type indicating the encoding
isTrunc  passes back whether the string is truncated
Returns
the byte array of UTF8 data

Implements ghidra::StringManager.

References ghidra::StringManager::StringData::byteData, checkCharacters(), ghidra::Datatype::getSize(), glb, ghidra::StringManager::hasCharTerminator(), ghidra::Datatype::isOpaqueString(), ghidra::StringManager::StringData::isTruncated, ghidra::Architecture::loader, ghidra::LoadImage::loadFill(), ghidra::StringManager::maximumChars, ghidra::StringManager::stringMap, testBuffer, and writeUnicode().
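
The fetch, convert, cache, and return flow described above can be sketched independently of the decompiler. Everything in this sketch is hypothetical: CachedStringSketch and StringCacheSketch are illustrative names, a plain integer stands in for Address, and a caller-supplied callback stands in for the load-image read and UTF8 conversion. Caching the empty result for non-string addresses is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the cached record: UTF8 bytes plus a truncation flag.
struct CachedStringSketch {
  std::vector<std::uint8_t> byteData;
  bool isTruncated = false;
};

// Hypothetical cache keyed by an integer offset instead of a ghidra::Address.
class StringCacheSketch {
public:
  // The callback fills in the record and returns false if there is no string.
  using Decoder = std::function<bool(std::uint64_t, CachedStringSketch &)>;

  explicit StringCacheSketch(Decoder d) : decode(std::move(d)) {
    assert(decode);  // a decoder callback must be supplied
  }

  const std::vector<std::uint8_t> &get(std::uint64_t addr, bool &isTrunc) {
    auto iter = cache.find(addr);
    if (iter == cache.end()) {
      CachedStringSketch entry;
      decodeCalls += 1;
      if (!decode(addr, entry)) {
        entry = CachedStringSketch();  // not a string: keep a zero-length record
      }
      iter = cache.emplace(addr, std::move(entry)).first;
    }
    isTrunc = iter->second.isTruncated;
    return iter->second.byteData;      // reference stays valid while cached
  }

  int decodeCalls = 0;  // exposed only so the caching effect can be observed

private:
  std::map<std::uint64_t, CachedStringSketch> cache;
  Decoder decode;
};
```

Returning a reference into the map avoids copying the byte array on every query; in this sketch the reference remains valid until the entry is erased.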

◆ writeUnicode()

bool ghidra::StringManagerUnicode::writeUnicode ( ostream &  s,
uint1 *  buffer,
int4  size,
int4  charsize 
)

Translate/copy unicode to UTF8.

Assume the buffer contains a null terminated unicode encoded string. Write the characters out (as UTF8) to the stream.

Parameters
s  is the output stream
buffer  is the given byte buffer
size  is the number of bytes in the buffer
charsize  specifies the encoding (1=UTF8, 2=UTF16, 4=UTF32)
Returns
true if the byte array contains valid unicode

References ghidra::StringManager::getCodepoint(), glb, ghidra::Translate::isBigEndian(), ghidra::StringManager::maximumChars, ghidra::Architecture::translate, and ghidra::StringManager::writeUtf8().

Referenced by getStringData().
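
The translation step can be sketched for the UTF32 case as follows. This is a minimal illustration under stated assumptions, not the library's code: writeUtf8Sketch and writeUtf32AsUtf8Sketch are hypothetical names, and the input is taken as host-order 32-bit values rather than a raw byte buffer with an architecture-defined endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: emit one code point as 1 to 4 UTF8 bytes.
void writeUtf8Sketch(std::ostream &s, int cp)
{
  assert(cp >= 0 && cp <= 0x10FFFF);
  if (cp < 0x80) {
    s.put(static_cast<char>(cp));
  } else if (cp < 0x800) {
    s.put(static_cast<char>(0xC0 | (cp >> 6)));
    s.put(static_cast<char>(0x80 | (cp & 0x3F)));
  } else if (cp < 0x10000) {
    s.put(static_cast<char>(0xE0 | (cp >> 12)));
    s.put(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
    s.put(static_cast<char>(0x80 | (cp & 0x3F)));
  } else {
    s.put(static_cast<char>(0xF0 | (cp >> 18)));
    s.put(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
    s.put(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
    s.put(static_cast<char>(0x80 | (cp & 0x3F)));
  }
}

// Hypothetical sketch: copy host-order UTF32 values to the stream as UTF8,
// stopping at the first zero. Returns false on an out-of-range or surrogate value.
bool writeUtf32AsUtf8Sketch(std::ostream &s, const std::uint32_t *cps, int count)
{
  for (int i = 0; i < count; ++i) {
    std::uint32_t cp = cps[i];
    if (cp == 0) break;                               // terminator
    if (cp > 0x10FFFF) return false;                  // beyond the unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogates are not characters
    writeUtf8Sketch(s, static_cast<int>(cp));
  }
  return true;
}
```

Note that on failure the sketch may already have written earlier characters to the stream, so a caller that needs all-or-nothing behavior would write to a temporary stream first.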


The documentation for this class was generated from the following files: