Compare commits

...

2 Commits

Author SHA1 Message Date
Brandon Turner 6c382a72b3 Land #3112, revert metasm update
This is a one-time fixup for a regression in a Pro feature.  It should
not be merged to master.  We will be fixing the root cause of this soon,
rendering this merge unnecessary.
2014-03-17 15:40:45 -05:00
William Vu 171cccb63d Revert #3013, Metasm update
This reverts commit ee0aa20955, reversing
changes made to 3c2eb29762.

[SeeRM #8775]
2014-03-17 10:57:27 -05:00
208 changed files with 8459 additions and 18023 deletions
+5
@@ -1,2 +1,7 @@
# Load a slightly tweaked METASM stub
require 'metasm/metasm'
# Manually load the classes we need from METASM
require 'metasm/ia32'
require 'metasm/mips'
require 'metasm/exe_format/shellcode'
+2
@@ -0,0 +1,2 @@
repo: a1be49ad3727a7dab9202f848ad39b5674e1aada
node: 7ec6509ea16231e365fffc91014755c810c27536
+22 -30
@@ -21,10 +21,6 @@ Ready-to-use scripts can be found in the samples/ subdirectory, check the
comments in the scripts headers. You can also try the --help argument if
you're feeling lucky.
For more information, check the doc/ subdirectory. The text files can be
compiled to html using the misc/txt2html.rb script.
Here is a short overview of the Metasm internals.
@@ -171,8 +167,8 @@ You can encode/decode an ExeFormat (ie decode sections, imports, headers etc)
Constructor: ExeFormat.decode_file(str), ExeFormat.decode_file_header(str)
Methods: ExeFormat#encode_file(filename), ExeFormat#encode_string
PE and ELF files have a LoadedPE/LoadedELF counterpart, that are able to work
with memory-mmaped versions of those formats (e.g. to debug running
PE and ELF files have a LoadedPE/LoadedELF counterpart, that is able to work
with memory-mmaped versions of those formats (e.g. to debugging running
processes)
@@ -202,31 +198,27 @@ disassembly/patching easily (using LoadedPE/LoadedELF as ExeFormat)
Debugging:
Metasm includes a few interfaces to handle debugging.
Metasm includes a few interfaces to allow live debugging.
The WinOS and LinOS classes offer access to the underlying OS processes (e.g.
OS.current.find_process('foobar') will retrieve a running process with foobar
in its filename ; then process.mem can be used to access its memory.)
The Windows and Linux low-level debugging APIs have a basic ruby interface
(PTrace and WinAPI) ; which are used by the unified high-end Debugger class.
Remote debugging is supported through the GDB server wire protocol.
High-level debuggers can be created with the following ruby line:
Metasm::OS.current.create_debugger('foo')
Only one kind of host debugger class can exist at a time ; to debug multiple
processes, attach to other processes using the existing class. This is due
to the way the OS debugging API works on Windows and Linux.
The low-level backends are defined in the os/ subdirectory, the front-end is
defined in debug.rb.
The Windows and Linux debugging APIs (x86 only) have a basic ruby interface
(PTrace32, extended in samples/rubstop.rb ; and WinDBG, a simple mapping of the
windows debugging API) ; those will be more worked on/integrated in the future.
A linux console debugging interface is available in samples/lindebug.rb ; it
uses a (simplified) SoftICE-like look and feel.
It can talk to a gdb-server socket ; use a [udp:]<host:port> target.
uses a SoftICE-like look and feel.
This interface can talk to a gdb-server through samples/gdbclient.rb ; use
[udp:]<host:port> as target.
The disassembler-gui sample allow live process interaction when using as
target 'live:<pid or part of program name>'.
The disassembler scripts allow live process interaction by using as target
'live:<pid or part of filename>'.
A generic debugging interface is available, it is defined in metasm/os/main.rb
It may be accessed using the Metasm::OS.current.create_debugger('foo')
It can be viewed in action using the GUI and 'open live' target.
C Parser:
@@ -244,11 +236,7 @@ It handles all the constructs i am aware of, except hex floats:
- __int8 etc native types
- Label addresses (&&label)
Also note that all those things are parsed, but most of them will fail to
compile on the Ia32/X64 backend (the only one implemented so far.)
Parsing C files should be done using an existing ExeFormat, with the
parse_c_file method. This ensures that format-specific macros/ABI are correctly
defined (ex: size of the 'long' type, ABI to pass parameters to functions, etc)
compile on the Ia32 backend (the only one implemented so far.)
When you parse a C String using C::Parser.parse(text), you receive a Parser
object. It holds a #toplevel field, which is a C::Block, which holds #structs,
@@ -261,11 +249,15 @@ CExpressions...)
A C::Parser may be #precompiled to transform it into a simplified version that
is easier to compile: typedefs are removed, control sequences are transformed
into 'if (XX) goto YY;' etc.
in if () goto ; etc.
To compile a C program, use PE/ELF.compile_c, that will create a C::Parser with
exe-specific macros defined (eg __PE__ or __ELF__).
The prefered way to create a C::Parser is to initialize it with a CPU and the
desired ExeFormat, so that it is
correctly initialized (eg type sizes: is long 4 or 8 bytes? etc) ; and
may define preprocessor macros needed to correctly parse standard headers.
Vendor-specific headers may need to use either #pragma prepare_visualstudio
(to parse the Microsoft Visual Studio headers) or prepare_gcc (for gcc), the
latter may be auto-detected (or may not).
+7 -7
@@ -2,14 +2,13 @@ List of TODO items, by section, in random order
Ia32
emu fpu
AVX support
add all sse2 instrs
realmode
X86_64
decompiler
CPU
Arm
Sparc
Cell
@@ -27,14 +26,14 @@ Assembler
Disasm
DecodedData
Exe decoding generate decodeddata ?
Function variable names using stack analysis + ExpressionString
Function-local namespace (esp+12 -> esp+var_42)
Fix thunk detection (thunk: mov ecx, 42 jmp [iat_thiscall] is not a thunk)
Test with ET_REL style exe
Store stuff out of mem (to handle big binaries)
Better :default usage
good on call eax, but not on <600k instrs> ret
use binary personality ? (uses call vs uses pushret..)
Improve 'backtrace => patch di.instr.args'
Improve backtrace -> patch di.instr.args exprs
path-specific backtracking ( foo: call a ; a: jmp retloc ; bar: call b ; b: jmp retloc ; retloc: ret ; call foo ; ret : last ret trackback should only reach a:)
Decode pseudo/macro-instrs (mips 'li')
Deoptimizer (instr reordering for readability)
@@ -70,7 +69,6 @@ Decompiler
Handle/hide compiler-generated stuff (getip, stack cookie setup/check..)
Handle call 1f ; 1: pop eax
More user control (force/forbid register arg, return type, etc)
Preserve C decompiled line association to range of asm decoded addrs
Debugger
OSX
@@ -83,6 +81,7 @@ Debugger
Remote debugging (small standalone C client)
Support dbghelp.dll (ms symbol server info)
Support debugee function call (gdb 'call')
Manipulate memory through C struct casts
ExeFormat
Handle minor editing without decode/reencode (eg patch ELF entrypoint)
@@ -106,9 +105,10 @@ GUI
show breakpoints
show jump direction from current flag values
have a console frontend
better graph positionning fallback
zoom font when zooming graph
text selection
copy/paste, selection
map (part of) the binary & debug it (map a PE on a linux host & run it)
Ruby
write a fast ruby-like interpreter
compile ruby AST to native optimized code
+146
@@ -0,0 +1,146 @@
Metasm source code organisation
===============================
The metasm source code takes advantage of the ruby language facilities,
which allows splitting the definition of a single class in multiple files.
Each file in the source tree holds code related to a particular feature of
the framework.
Directories
-----------
The top-level directories are :
* `doc/`: this documentation
* `metasm/`: the framework core
* `samples/`: a set of sample scripts showing various functionalities of the framework
* `tests/`: a few unit tests (too few..)
* `misc/`: misc ruby scripts, not directly related to metasm
The core
--------
The `metasm/` directory holds most of the code of the framework, along with the
main `metasm.rb` file in the top directory.
The top-level `metasm.rb` has code to load parts of the framework source on demand
in the ruby interpreter, which is implemented with ruby's <const_missing.txt>
Executable formats
##################
The `exe_format/` subdirectory contains the implementations of the various
binary file formats supported in the framework.
Three files have a special meaning here:
* `main.rb`: it defines the <core/ExeFormat.txt> class
* `serialstruct.rb`: here you'll find the definitions of <core/SerialStruct.txt>
* `autoexe.rb`: the implementation of <core/AutoExe.txt>, which allows the recognition of arbitrary files from their binary signature.
The `main.rb` file is included in all other formats, as all file classes
are subclasses of `ExeFormat`.
The `serialstruct.rb` implements a helper class to ease the description of
binary structures, and generate parsing/encoding functions for those.
All other files implement a specific file format handler. The bigger files
(`ELF` and `PE/COFF`) are split between the parsing/encoding functions and
decoding/disassembly.
CPUs
####
All supported architectures have a dedicated subdirectory, and a helper file
that will simply include all the arch-specific files.
All those files will contribute to add functions to the same class implementing
the CPU interface. Not all CPUs implement all those features. They are:
* `main.rb`: inner classes definitions (for registers etc), generic functions
* `opcodes.rb`: initializes the opcode list for the architecture
* `encode.rb`: methods to encode instructions
* `decode.rb`: methods to decode/emulate instructions
* `parse.rb`: methods to parse asm instructions from a source file
* `render.rb`: methods to output an instruction to a string
* `compile_c.rb`: the C compiler implementation
* `decompile.rb`: the arch-specific part of the generic decompiler
* `debug.rb`: arch-specific information used when debugging target of this architecture
In some cases the files are small enough to be all merged into the `main.rb` file.
Operating systems
#################
The `os/` subdirectory holds the code used to abstract an operating system.
The files here define an API to enumerate running processes and interact
with them in various ways. The <core/Debugger.txt> class and subclasses are
defined there.
Those files also hold the list of known functions and the system libraries
in which they can be found (see <core/WindowsExports.txt> or <core/GNUExports.txt>), which
are used when linking executable files.
Graphical user-interface
########################
The `gui/` subdirectory contains the code needed by the metasm graphical user-interfaces.
Currently those include the disassembler and the debugger (see the *samples* section).
Those GUI elements are implemented using a custom GUI abstraction, and reside in the
various `dasm_*.rb` and `debug.rb`.
The actual implementations of the GUI are found in:
* `win32.rb`: the native Win32 API backend
* `gtk.rb`: a Gtk2 backend, intended for unix platforms
* `qt.rb`: a Qt backend experiment
Please note that the Qt backend does not work *at all*.
The `gui.rb` file in the main directory is used to choose, among the available GUI backends,
the most appropriate one for the current session.
Others
######
The other files directly in the `metasm/` directory are either support files
(eg `encode.rb`, `parse.rb`) that hold generic functions to be used by
specific cpu/exeformat instances, or implement arch-agnostic features.
Those include:
* `preprocessor.rb`: the C/asm preprocessor/lexer
* `parse_c.rb`: this is the implementation of the C parser
* `compile_c.rb`: this is a C precompiler, it generates a very simplified C from a standard source
* `decompile.rb`: the generic decompiler code, it uses arch-specific functions defined in the arch folder
* `dynldr.rb`: this module is used when interacting directly with the host operating system through <core/DynLdr.txt>
The samples
-----------
The `samples/` directory contains a lot of small files that are intended to be
examples of how to use the framework. It also holds experiments and
work-in-progress for features that may later be integrated into the main
framework.
The comment at the beginning of the file should be clear about the purpose
of the script, and the scripts are expected to be copy/pasted and tweaked
for the specific task needed by the user (that's you).
Some of those files however are full-featured applications:
* `exeencode.rb`: a shellcode compiler, with its `peencode.rb`, `elfencode.rb`, `machoencode.rb` counterparts
* `disassemble.rb`: a disassembler
* `disassemble-gui.rb`: the graphical disassembler / debugger
The `samples/dasm-plugins/` subdirectory holds various plugins for the disassembler.
+16
@@ -0,0 +1,16 @@
The const_missing trick
=======================
Metasm uses a ruby trick to load most of the framework on demand, so that
*e.g.* the `MIPS`-related classes are never loaded in the ruby interpreter
unless you use them.
It is set up by the top-level `metasm.rb` file, by using the ruby mechanism of
`Module.autoload`. This mechanism will automatically load the specified metasm
components whenever a reference is made to one of the constants listed there.
Metasm provides a replacement top-level file, `misc/metasm-all.rb`,
which will unconditionally load all metasm files.
This will not however load mutually exclusive files, like the Gui subsystems ;
in this case it will load only the autodetected gui module (win32 or gtk).
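The on-demand loading can be sketched with plain ruby, outside of metasm. This is only an illustration: the `LazyDemo` module, its `Mips` stand-in class and the temporary file are hypothetical names made up for the example; `Module#autoload` and `Module#autoload?` are the only real mechanisms used.

```ruby
require 'tmpdir'

# Write a throwaway component file (a hypothetical stand-in for a
# framework source file; it is not part of metasm).
dir  = Dir.mktmpdir
path = File.join(dir, 'lazy_demo_mips.rb')
File.write(path, <<~SRC)
  module LazyDemo
    class Mips
      def shortname
        'mips'
      end
    end
  end
SRC

module LazyDemo; end

# Register the constant: the file is not read at this point.
LazyDemo.autoload(:Mips, path)
pending_before = LazyDemo.autoload?(:Mips)   # non-nil while still pending

# The first reference to the constant triggers the load.
cpu_name = LazyDemo::Mips.new.shortname
pending_after = LazyDemo.autoload?(:Mips)    # nil once the file is loaded

puts "pending before use: #{!pending_before.nil?}"
puts "loaded class reports: #{cpu_name}"
puts "pending after use: #{!pending_after.nil?}"
```

The cost of parsing a component is only paid by the scripts that actually reference it, which is the point of the trick.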
+247
@@ -0,0 +1,247 @@
DynLdr
======
DynLdr is a class that uses metasm to dynamically make native methods,
or native method wrappers, available to the running ruby interpreter.
It leverages the built-in C parser / compiler.
It is implemented in `metasm/dynldr.rb`.
Currently only supported for <core/Ia32.txt> and <core/X86_64.txt> under
Windows and Linux.
Basics
------
Native library wrapper
######################
The main usage is to generate interfaces to native libraries.
This is done through the `#new_api_c` method.
The following example will read the specified C header fragment,
define ruby constants for all `#define`/`enum`, and define ruby
method wrappers to call the native functions whose prototype is
present in the header.
All referenced native functions must be exported by the given
library file.
class MyInterface < DynLdr
c_header = <<EOS
#define SomeConst 42
enum { V1, V2 };
__stdcall int methodist(char*, int);
EOS
new_api_c c_header, 'mylib.dll'
end
Then you can call, from ruby:
MyInterface.methodist("lol", MyInterface::SOMECONST)
Constant/enum names are converted to full uppercase, and method
names are converted to full lowercase.
Dynamic native inline function
##############################
You can also dynamically compile native functions, that are compiled
in memory and copied to RWX memory with the right ruby wrapper:
class MyInterface < DynLdr
new_func_c <<EOS
int bla(char*arg) {
if (strlen(arg) > 4)
return 1;
else
return 0;
}
EOS
end
References to external functions are allowed, and resolved automatically.
The ruby objects used as arguments to the wrapper method are
automatically converted to the right C type.
You can also write native functions in assembly, but you must specify a
C prototype, used for argument and return value conversion.
class MyInterface < DynLdr
new_func_asm "int increment(int i);", <<EOS
mov eax, [esp+4]
inc eax
ret
EOS
p increment(4)
end
Structures
----------
`DynLdr` handles C structures.
Once a structure is specified in the C part, you can create a ruby object
using `MyClass.alloc_c_struct(structname)`, which will allocate an object of the
right size to hold all the structure members, and with the right accessors.
To access/modify struct members, you can either use a `Hash`-style access
structobj['membername'] = 42
or `Struct`-style access
structobj.membername = 42
Member names are matched case-insensitively, and nested structures/unions
are also searched.
The struct members can be initially populated by passing a `Hash` argument
to the `alloc_c_struct` constructor. Additionally, this hash may use the
special value `:size` to reference the byte size of the current structure.
class MyInterface < DynLdr
new_api_c <<EOS
struct sname {
int s_mysize;
int s_value;
union {
struct {
int s_bits:4;
int s_bits2:4;
};
int s_union;
};
};
EOS
end
# field s_mysize holds the size of the structure in bytes, ie 12
s_obj = MyInterface.alloc_c_struct('sname', :s_mysize => :size, :s_value => 42)
# we can access fields using Hash-style access
s_obj['s_UniOn'] = 0xa8
# or Struct-style access
puts '0x%x' % s_obj.s_BiTS2 # => '0xa'
This object can be directly passed as argument to a wrapped function, and
the native function will receive a pointer to this structure (that it can
freely modify).
This object is a `C::AllocStruct`, defined in `metasm/parse_c.rb`.
Internally, it is based on a ruby `String`, and has a reference to the parser's
`Struct` to find the mapping membername -> offsets/length.
See <core/CParser.txt> for more details.
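The idea of a structure backed by a ruby `String` plus a name -> offset mapping can be sketched in plain ruby. This is not metasm's `C::AllocStruct`: the `StringBackedStruct` class is hypothetical, and the layout (three 4-byte little-endian ints at offsets 0, 4 and 8) is an assumption chosen to resemble the `sname` structure above, without bitfield support.

```ruby
# Hypothetical sketch of a String-backed structure; it is not metasm's
# C::AllocStruct. The layout assumes 4-byte little-endian ints.
class StringBackedStruct
  # member name => byte offset (every member is 4 bytes long here)
  OFFSETS = { 's_mysize' => 0, 's_value' => 4, 's_union' => 8 }.freeze
  INT32_LE = 'l<'

  attr_reader :raw

  def initialize
    @raw = ("\0" * 12).b   # binary buffer holding the whole structure
  end

  # Hash-style read: decode 4 bytes at the member offset.
  def [](name)
    @raw.byteslice(offset_of(name), 4).unpack1(INT32_LE)
  end

  # Hash-style write: overwrite 4 bytes at the member offset.
  def []=(name, value)
    @raw[offset_of(name), 4] = [value].pack(INT32_LE)
  end

  private

  # Member names are looked up case-insensitively.
  def offset_of(name)
    OFFSETS.fetch(name.to_s.downcase)
  end
end

s = StringBackedStruct.new
s['s_mysize'] = s.raw.bytesize
s['s_UniOn'] = 0xa8
puts s['s_union']
puts s.raw.unpack('l<3').inspect
```

Because the storage is an ordinary byte buffer, handing it to native code amounts to passing a pointer to the String contents.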
Callbacks
---------
`DynLdr` handles C callbacks, with arbitrary ABI.
Any number of callbacks can be defined at any time.
C callbacks are backed by a ruby `Proc`, eg `lambda {}`.
class MyInterface < DynLdr
new_api_c <<EOS
void qsort(void *, int, int, int(*)(void*, void*));
EOS
str = "sanotheusnaonetuh"
cmp = lambda { |p1, p2|
memory_read(p1, 1) <=> memory_read(p2, 1)
}
qsort(str, str.length, 1, cmp)
p str
end
Argument conversion
-------------------
Ruby objects passed to a wrapper method are converted to the corresponding
C type:
* `Strings` are converted to a C pointer to the byte buffer (also directly
accessible from ruby through `DynLdr.str_ptr(obj)`)
* `Integers` are converted to their C equivalent, according to the prototype
(`char`, `unsigned long long`, ...)
* `Procs` are converted to a C callback
* `Floats` are not supported for now.
Working with memory
-------------------
DynLdr provides different ways to allocate memory.
* `alloc_c_struct` to allocate a C structure
* `alloc_c_ary` to allocate C array of some type
* `alloc_c_ptr`, which is just an ary of size 1
* `memory_alloc` allocates memory from a new memory page
`memory_alloc` works by calling `mmap` under linux and `VirtualAlloc` under windows,
and is suitable for allocating memory where you want to control
the memory permissions (read, write, execute). This is done through `memory_perm`.
`memory_perm` takes as arguments the start address, the length, and the new permission, specified as a String (e.g. 'r', 'rwx').
To work with memory that may be returned by an API (e.g. `malloc`),
DynLdr provides ways to read and write arbitrary pointers from the ruby
interpreter memory.
Take care: when called with invalid addresses, those may generate faults that
will crash the ruby interpreter.
* `memory_read` takes a pointer and a length, and returns a String
* `memory_read_int` takes a pointer, and returns an Integer (of pointer size,
e.g. 64 bit in a 64-bit interpreter)
* `memory_write` takes a pointer and a String, and writes it to memory
* `memory_write_int` takes a pointer and an Integer, and writes it to memory
Hacking
-------
Internally, DynLdr relies on a number of features that are not directly
available from the ruby interpreter.
So the first thing done by the script is to generate a binary native module
that will act as a C extension to the ruby interpreter.
This binary is necessarily different depending on the interpreter.
The binary name includes the target architecture, in the format
dynldr-*arch*-*cpu*-*19*.so, e.g.
* dynldr-linux-ia32.so
* dynldr-windows-x64-19.so
This native module is (re)generated if it does not exist, or is older than the
`dynldr.rb` script.
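The regeneration rule (missing, or older than the script) boils down to a timestamp comparison. A minimal sketch in plain ruby follows; the `needs_rebuild?` helper and the file names are hypothetical, and this is not the code used by `dynldr.rb`.

```ruby
require 'tmpdir'

# Hypothetical helper: true when the generated file is missing or older
# than the script that produces it.
def needs_rebuild?(binary_path, script_path)
  return true unless File.exist?(binary_path)
  File.mtime(binary_path) < File.mtime(script_path)
end

results = []
Dir.mktmpdir do |dir|
  script = File.join(dir, 'script.rb')
  binary = File.join(dir, 'module.so')
  now = Time.now

  File.write(script, '# generator')
  File.utime(now, now, script)
  results << needs_rebuild?(binary, script)   # binary does not exist yet

  File.write(binary, 'stub')
  File.utime(now, now - 60, binary)           # binary older than the script
  results << needs_rebuild?(binary, script)

  File.utime(now, now + 60, binary)           # binary newer than the script
  results << needs_rebuild?(binary, script)
end

puts results.inspect
```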
A special trick is used in this module, as it does not know the actual name
of the ruby library used by the interpreter. So on linux, the `libruby` is
removed from the `DT_NEEDED` library list, and on windows a special stub
is assembled to manually resolve the ruby imports needed by the module from
any instance of `libruby` present in the running process.
The native file is written to a directory writable by the current user.
The following directories are tried in order, until a suitable one is found:
* the `metasm` directory itself
* the `$HOME`/`$APPDATA`/`$USERPROFILE` directory
* the `$TMP`/`$TEMP`/current directory
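The directory search above is a "first suitable candidate wins" loop. A minimal sketch in plain ruby, where `first_writable_dir` is a hypothetical helper and not the function used by metasm:

```ruby
require 'tmpdir'

# Hypothetical helper: returns the first existing directory the current
# user can write to, or nil when none qualifies. Unset environment
# variables show up as nil entries and are skipped.
def first_writable_dir(candidates)
  candidates.compact.find { |d| File.directory?(d) && File.writable?(d) }
end

tmp     = Dir.mktmpdir
missing = File.join(tmp, 'does-not-exist')

# nil and missing entries are skipped, so the temporary directory wins.
chosen = first_writable_dir([nil, missing, tmp, Dir.pwd])
puts chosen == tmp

# The same helper applied to the environment-based part of the list above.
env_dirs = %w[HOME APPDATA USERPROFILE TMP TEMP].map { |k| ENV[k] } + [Dir.pwd]
puts first_writable_dir(env_dirs).inspect
```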
+43
@@ -0,0 +1,43 @@
ExeFormat
=========
This class is the parent of all executable format handlers.
It is defined in `metasm/exe_format/main.rb`.
It defines some standard shortcut functions, such as:
* `Exe.decode_file(filename)`
* `Exe.assemble(cpu,asm_source)`
* `Exe.compile_c(cpu,c_source)`
* `Exe#encode_file(filename)`
These methods will instantiate a new Exe, and call the corresponding
methods, *e.g.* `load` with the file content, and `decode`.
The handling of the different structures in the binary format should be
done using the <core/SerialStruct.txt> facility.
The subclasses are expected to implement various functions, depending on the
usage (refer to the ELF and COFF implementations for more details):
File decoding/disassembly
-------------------------
* `#decode_header`: parse only the file header from the raw data in `#encoded`
* `#decode`: parse all the raw data in `#encoded`
* `#cpu_from_headers`: return a <core/CPU.txt> instance according to the exe header information
* `#get_default_entrypoints`: the list of entrypoints (exported functions, etc)
* `#dump_section_header`: return a string that may be assembled to recreate the specified section
* `#section_info`: return a list of generic section information for the disassembler
File encoding/source parsing
----------------------------
* `#tune_prepro`: define exe-specific macros for the preprocessor (optional)
* `#parse_init`: initialize the `@cursource` array to receive the parsed asm source
* `#parse_parser_instruction`: parse exe-specific instructions, eg `.text`, `.import`...
* `#assemble`: assemble the content of `@cursource` into binary section contents
* `#encode`: assemble the various sections and a binary header into `@encoded`
+220
@@ -0,0 +1,220 @@
Expression
==========
Metasm uses this class to represent arbitrary symbolic arithmetic expressions, e.g.
* `42`
* `eax + 12`
* `loc_4228h + 4*ebx - 12`
These expressions can include `Integers`, `Symbols`, and `Strings`.
The symbols and strings represent arbitrary variables, with the convention that
strings represent fixed quantities (eg addresses, labels), whereas symbols
represent more variable stuff (eg register values).
There is also a special symbol that may be used, `:unknown`, to represent a
value that is known to be unknown. See the `reduce` section.
See also <core/Indirection.txt>.
The Expression class holds all methods relative to Integer binary manipulation,
that is `encoding` and `decoding` from/to a binary blob (see also
<core/EncodedData.txt>)
Members
-------
Expressions hold exactly 3 members:
* `lexpr`, the left-hand side of the expression
* `rexpr`, the right-hand side
* `op`, the operator
`lexpr` and `rexpr` can be any value, most often String, Symbol, Integer or
Expression. For unary operators, `lexpr` is `nil`.
`op` is a Symbol representing the operation.
It should be from the list:
* arithmetic: `+ - / * >> << & | ^`
* boolean: `|| && == != > >= < <=`
* unary: `+ - ~ !`
Instantiation
-------------
In ruby code, use the class method `[]`. It takes 1 to 3 arguments, `lexpr`,
`op`, and `rexpr`. `lexpr` defaults to `nil`, and `op` defaults to `:+` (except
for negative numeric values, which are stored with `op` == `:-` and `rexpr` ==
abs).
If `lexpr` or `rexpr` is an `Array`, the `[]` constructor is called
recursively, to ease the definition of nested Expressions.
Examples:
Expression[42]
Expression[:eax, :+, 12]
Expression[:-, 'my_var']
Expression[[:eax, :-, 4], :*, [:ebx, :+, 0x12]]
The Expression class also includes a parser, to allow creating an expression
from a string. `parse_string!` will create an Expression and update its
argument to point after the last part read successfully into the expr.
The parser handles standard C operator precedence.
str = "1 + var"
Expression.parse_string!(str) # => Expression[1, :+, "var"]
str = "42 bla"
Expression.parse_string!(str) # => Expression[42]
str # => "bla"
Use `parse_string` without the ! to parse the string without updating it.
External variables
------------------
The `externals` method will return all non-integer members of the Expression.
Expression[[:eax, :+, 42], :-, "bla"].externals # => [:eax, "bla"]
Pattern matching
----------------
The `match` method checks an Expression against a pattern without
having to check individual members. The pattern should be an Expression,
whose variable members should be Strings or Symbols, which are also passed as
arguments to the match function. On successful match, the correspondence
between the pattern variables and the actual values matched is returned as a Hash.
Expression[1, :+, 2].match(Expression['var', :+, 2], 'var')
# => { 'var' => 1 }
Expression[1, :+, 2].match(Expression['var', :+, 'var'], 'var')
# => nil
Expression[1, :+, 1].match(Expression['var', :op, 'var'], 'var', :op)
# => { 'var' => 1, :op => :+ }
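The matching rules above (a variable binds to whatever it faces, and must bind to the same value everywhere) can be sketched on plain nested Arrays. This is not metasm's implementation: `match_sketch` is a hypothetical helper, and `[lexpr, op, rexpr]` Arrays stand in for Expression objects.

```ruby
# Hypothetical sketch working on [lexpr, op, rexpr] Arrays instead of
# Expression objects. Returns a Hash of bindings, or nil on mismatch.
def match_sketch(expr, pattern, vars, found = {})
  if vars.include?(pattern)
    # A variable must keep the same value at every place it appears.
    return nil if found.key?(pattern) && found[pattern] != expr
    found[pattern] = expr
    found
  elsif pattern.is_a?(Array)
    return nil unless expr.is_a?(Array) && expr.length == pattern.length
    pattern.each_index do |i|
      return nil unless match_sketch(expr[i], pattern[i], vars, found)
    end
    found
  else
    # Anything else is a literal and must be equal.
    pattern == expr ? found : nil
  end
end

m1 = match_sketch([1, :+, 2], ['var', :+, 2], ['var'])
m2 = match_sketch([1, :+, 2], ['var', :+, 'var'], ['var'])
m3 = match_sketch([1, :+, 1], ['var', :op, 'var'], ['var', :op])

puts m1.inspect
puts m2.inspect
puts m3.inspect
```

The three calls mirror the three examples above, including the failing one where `'var'` would need to be both 1 and 2.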
Reduction
---------
Metasm Expressions include a basic symbolic computation engine, that allows
some simple transformations of the Expression. The reduction will also
compute numerical values whenever possible. If the final result is fully
numeric, an Integer is returned, otherwise a new Expression is returned.
In this context, the special value `:unknown` has a particular meaning.
Expression[1, :+, 2].reduce
# => 3
Expression[:eax, :+, [:ebx, :-, :eax]].reduce
# => Expression[:ebx]
Expression[1, :+, [:eax, :+, 2]].reduce
# => Expression[:eax, :+, 3]
Expression[:unknown, :+, :eax].reduce
# => Expression[:unknown]
The symbolic engine operates mostly on additions/subtractions, and
no-operations (eg shift by 0). It also handles some boolean composition.
The detail can be found in the #replace_rec method body, in `metasm/main.rb`.
The reduce method can also take a block argument, which will be called at
every step in the recursive reduction, for custom operations. If the block
returns nil, the result is unchanged, otherwise the new value is used as
replacement. For example, if you operate on 32-bit values and want to get rid
of `bla & 0xffffffff`, use
some_expr.reduce { |e|
if e.kind_of?(Expression) and e.op == :& and e.rexpr == 0xffff_ffff
e.lexpr
end
}
Binding
-------
An expression involving variable externals can be bound using a Hash. This will
replace any occurrence of a key of the Hash by its value in the expression
members. The `bind` method will return a new Expression with the substitutions,
and the `bind!` method will update the Expression in-place.
Expression['val', :+, 'stuff'].bind('val' => 4, 'stuff' => 8).reduce
# => 12
Expression[:eax, :+, :ebx].bind(:ebx => 42)
# => Expression[:eax, :+, 42]
Expression[:eax, :+, :ebx].bind(:ebx => :ecx)
# => Expression[:eax, :+, :ecx]
You can use Expressions as keys, but they will only be used on perfect matches.
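Binding followed by reduction can be sketched on plain nested Arrays. This is a toy model, not metasm code: `bind_sketch` and `fold_sketch` are hypothetical helpers, `[lexpr, op, rexpr]` Arrays stand in for Expressions, and the folding only computes integer-only nodes (none of the symbolic simplifications of the real `reduce`).

```ruby
# Hypothetical sketch on nested [lexpr, op, rexpr] Arrays; not metasm code.
def bind_sketch(expr, map)
  # A key replaces a member only on a perfect match (this also covers
  # whole sub-arrays used as keys).
  return map[expr] if map.key?(expr)
  expr.is_a?(Array) ? expr.map { |e| bind_sketch(e, map) } : expr
end

# Computes integer-only nodes; anything symbolic is kept as is.
def fold_sketch(expr)
  return expr unless expr.is_a?(Array)
  lhs, op, rhs = expr.map { |e| fold_sketch(e) }
  lhs.is_a?(Integer) && rhs.is_a?(Integer) ? lhs.public_send(op, rhs) : [lhs, op, rhs]
end

full    = fold_sketch(bind_sketch(['val', :+, 'stuff'], { 'val' => 4, 'stuff' => 8 }))
partial = bind_sketch([:eax, :+, :ebx], { :ebx => 42 })
subexpr = fold_sketch(bind_sketch([[:eax, :+, 4], :*, 2], { [:eax, :+, 4] => 10 }))

puts full.inspect
puts partial.inspect
puts subexpr.inspect
```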
Binary packing
--------------
Encoding
########
The `encode` method will generate an EncodedData holding the expression, either
as binary if it can reduce to an integral value, or as a relocation.
The arguments are the relocation type and the endianness, plus an optional
backtrace (to notify the user where an overflowing relocation comes from).
The `encode_imm` class method will generate a raw String for a given
integral value, a type and an endianness.
The type can be given as a byte size.
Expression.encode_imm(42, :u8, :little) # => "*"
Expression.encode_imm(42, 1, :big) # => "*"
Expression.encode_imm(256, :u8, :little) # raise EncodeError
On overflows (value cannot be encoded in the bit field) an EncodeError
exception is raised.
Decoding
########
The `decode_imm` class method can be used to read a binary value into an
Integer, with an optional offset into the binary string.
Expression.decode_imm("*", :u8, :little) # => 42
Expression.decode_imm("bla\xfe\xff", :i16, :little, 3) # => -2
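The byte strings in these examples can be cross-checked with ruby's own `Array#pack` and `String#unpack1`, without metasm. The directives used here (`C` for unsigned 8-bit, `s<` for signed 16-bit little-endian) are standard ruby; the mapping from the `:u8` / `:i16` type names to those directives is an assumption made for the comparison.

```ruby
# Encoding side: 42 as one unsigned byte is the ASCII character "*".
u8_bytes = [42].pack('C')

# Decoding side: read a signed 16-bit little-endian value at byte offset 3.
blob = "bla\xfe\xff".b
i16  = blob.byteslice(3, 2).unpack1('s<')

# Unlike encode_imm, pack does not raise on overflow: it silently keeps
# the low 8 bits of the value.
wrapped = [256].pack('C')

puts u8_bytes.inspect
puts i16
puts wrapped.bytes.inspect
```

The last case shows why a dedicated encoder that raises on overflow is useful: with `pack` alone, an out-of-range immediate would go unnoticed.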
Arithmetic coercion
-------------------
Expression implements the `:+` and `:-` ruby methods, so that `expr + 4`
works as expected. The result is reduced.
Integer methods
---------------
The Expression class offers a few methods to work with integers.
make_signed
###########
`make_signed` will convert a raw unsigned to its equivalent signed value,
given a bit size.
Expression.make_signed(1, 16) # => 1
Expression.make_signed(0xffff, 16) # => -1
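The conversion is the usual two's complement reinterpretation. A minimal sketch in plain ruby, where `make_signed_sketch` is a hypothetical stand-in and not the metasm method:

```ruby
# Hypothetical stand-in: reinterpret the low `bits` bits of an unsigned
# value as a two's complement signed integer.
def make_signed_sketch(value, bits)
  value &= (1 << bits) - 1                # keep only the low `bits` bits
  sign_bit = 1 << (bits - 1)
  value >= sign_bit ? value - (1 << bits) : value
end

puts make_signed_sketch(1, 16)
puts make_signed_sketch(0xffff, 16)
puts make_signed_sketch(0x80, 8)
```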
in_range?
#########
`in_range?` can check if a given numeric value would fit in a particular
<core/Relocation.txt> field. The method can return true or false if it
fits or not, or `nil` if the result is unknown (eg the expr has no numeric
value).
Expression.in_range?(42, :i8) # => true
Expression.in_range?(128, :i8) # => false
Expression.in_range?(-128, :i8) # => true
Expression.in_range?(Expression['bla'], :u32) # => nil
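The three-valued result (true, false, or `nil` when unknown) can be sketched in plain ruby. `in_range_sketch` and its small type table are hypothetical: only a few types are listed for illustration, and a String stands in for a non-numeric expression.

```ruby
# Hypothetical table: value range for a few field types.
FIELD_RANGES = {
  i8:  -0x80..0x7f,
  u8:  0..0xff,
  i16: -0x8000..0x7fff,
  u16: 0..0xffff,
  u32: 0..0xffff_ffff
}.freeze

# Hypothetical stand-in: true/false for integers, nil when the value
# has no numeric meaning yet (e.g. an unresolved label).
def in_range_sketch(value, type)
  return nil unless value.is_a?(Integer)
  FIELD_RANGES.fetch(type).cover?(value)
end

puts in_range_sketch(42, :i8).inspect
puts in_range_sketch(128, :i8).inspect
puts in_range_sketch(-128, :i8).inspect
puts in_range_sketch('bla', :u32).inspect
```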
+27
@@ -0,0 +1,27 @@
GNUExports
==========
This class is defined in `metasm/os/gnu_exports.rb`.
It defines an `EXPORT` constant, a Hash, whose keys
are the standard linux API symbol names, and values
are the library name where you can find this symbol.
The equivalent for windows is <core/WindowsExports.txt>.
Usage
-----
The main usage of this class is the automatic generation
of the <core/ELF.txt> dynamic tag `DT_NEEDED` from the
external symbols referenced by a binary during compilation.
This is done in the `automagic_symbols` method.
Symbols
-------
The current version holds the symbols of the debian
glibc, from `libc.so.6` and `libdl.so.2`.
Ruby symbols are also defined, from `libruby1.8.so.1.8`.
+234
@@ -0,0 +1,234 @@
Ia32
====
The Ia32 architecture, aka *Intel_x86*, is the most advanced among the
architectures implemented in the framework. It is a subclass of the
generic <core/CPU.txt>.
It can handle binary code for the 16 and 32bits modes of the processor.
It is a superclass for the <core/X86_64.txt> object, a distinct processor
that handles 64-bit *long_mode* (aka *x64*, *amd64*, *em64t*)
The CPU `shortname` is `ia32` (`ia32_16` in 16-bit mode, and a `_be` suffix
if bigendian)
Opcodes
-------
The opcodes list can be customized to match that available on a specific
version of the processor. The possibilities are:
* 386_common
* 386
* 387
* 486
* pentium
* p6
* 3dnow
* sse
* sse2
* sse3
* vmx
* sse42
Most opcodes are available in the framework, with the notable exception of:
* most sse2 simd instructions
* the AVX instructions
* amd-specific instructions
The `386_common` family is the subset of 386 instruction that are most
commonly found in standard usermode programs (no `in`/`out`/bcd
arithmetic/far call/etc).
This can be useful when manipulating stuff that is not known to be i386
binary code.
Initialization
--------------
An Ia32 <core/CPU.txt> object can be created using the following code:
Metasm::Ia32.new
The `X86` alias may be used in place of `Ia32`.
The constructor accepts optional arguments to specify the CPU size, the
opcode family, and the endianness of the processor. The arguments can
be given in any order. For example,
Metasm::Ia32.new(16, 'pentium', :big)
will create a 16-bit mode cpu, with opcodes up to the 'pentium' CPU family,
in big-endian mode.
The Ia32 initializer has the convenience feature that it will create an
X86_64 instance when given the 64 bit size (e.g. `Ia32.new(64)` returns an
X86_64 instance)
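Accepting arguments in any order works when each argument can be recognized by its type. A minimal sketch in plain ruby: `cpu_options` is a hypothetical helper, not the Ia32 initializer, and the default values in it are assumptions made for the example.

```ruby
# Hypothetical sketch: classify constructor arguments by type so that
# their order does not matter. The defaults below are assumptions.
def cpu_options(*args)
  opts = { size: 32, family: 'latest', endianness: :little }
  args.each do |arg|
    case arg
    when Integer then opts[:size] = arg
    when String  then opts[:family] = arg
    when Symbol  then opts[:endianness] = arg
    else raise ArgumentError, "unexpected argument #{arg.inspect}"
    end
  end
  opts
end

a = cpu_options(16, 'pentium', :big)
b = cpu_options(:big, 16, 'pentium')

puts a.inspect
puts a == b
```

This only works because the three kinds of argument never share a ruby type; two Integer options, for instance, would need another way to tell them apart.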
Assembler
---------
The parser handles only Intel-style asm syntax, *e.g.*
some_label:
mov eax, 10h
mov ecx, fs:[eax+16]
push dword ptr fs:[1Ch]
call ecx
test al, al
jnz some_label
ret
fmulp ST(4)
Instruction arguments
#####################
The parser recognizes standard registers, such as
* `eax`
* `ah`
* `mm4` (mmx 64bit register)
* `xmm2` (xmm 128bit register)
* `ST` (current top of the FPU stack)
* `ST(3)` (FPU reg nr.3)
* `cs` (segment register)
* `dr3` (debug register)
* `cr2` (control register)
It also supports nonexistent registers, such as
* `cr7`
* `dr4`
* `segr6` (segment register nr.6)
The indirections are called `ModRM`. They take the form:
* `[eax]` (memory pointed by `eax`)
* `byte ptr [eax]` (1-byte memory pointed by `eax`)
* `byte [eax]` (same as previous)
* `fs:[eax]` (offset `eax` from the base of the `fs` segment)
* `[fs:eax]` (same as previous)
The pointer itself can be:
* `[eax]` (any register)
* `[eax+12]` (base + numeric offset)
* `[eax+ebx]` (base + register index)
* `[eax + 4*ebx]` (base + 1,2,4 or 8 * index)
* `[eax + 2*ebx + 42]` (both)
Note that the form base + s*index cannot use `esp` as index with s != 1.
For indirection sizes, the size is taken from the size of other arguments
if it is not specified (eg `mov eax, [42]` will be 4 bytes, and `mov al, [42]`
will be 1). The explicit size specifier can be:
* `byte` (8bits)
* `word` (16)
* `dword` (32)
* `qword` (64)
* `oword` (128)
* `_12bits` (12, arbitrary numbers can be used)
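As a rough illustration of the pointer forms listed above, the following standalone sketch computes the address designated by a base + scale*index + offset expression. The `effective_address` helper and the register values are hypothetical; the sketch does not use Metasm and only mirrors the constraints stated in the text.

```ruby
# Standalone sketch: address = base + scale * index + displacement.
VALID_SCALES = [1, 2, 4, 8].freeze

def effective_address(regs, base: nil, index: nil, scale: 1, disp: 0)
  raise ArgumentError, 'scale must be 1, 2, 4 or 8' unless VALID_SCALES.include?(scale)
  # Restriction quoted in the text: esp cannot be a scaled index.
  raise ArgumentError, 'esp cannot be used as index with scale != 1' if index == :esp && scale != 1

  addr = disp
  addr += regs.fetch(base) if base
  addr += scale * regs.fetch(index) if index
  addr & 0xffff_ffff # wrap around to 32 bits
end

regs = { eax: 0x1000, ebx: 0x10, esp: 0x7000 }
effective_address(regs, base: :eax)                                     # [eax]
effective_address(regs, base: :eax, index: :ebx, scale: 2, disp: 42)    # [eax + 2*ebx + 42]
```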
Parser commands
###############
The following commands are recognized in an asm source:
* `.mode`
* `.bits`
They are synonymous, and serve to change the mode of the processor to either
16 or 32bits.
They should be the first instruction in the source; changing the mode during
parsing is not supported. Doing so would change the mode only for the next
instructions to be parsed, but for all instructions (incl. those already parsed
at this point) when encoding, which is likely **not** what you want. See the
`codeXX` prefixes.
Note that changing the CPU size once it was created may have bad side-effects.
For example, some preprocessor macros may already have been generated according
to the original size of the CPU and will be incorrect from this point on.
Prefixes
########
The following prefixes are handled:
* `lock`
* `rep`, `repz`, `repnz`, `repe`, `repne`
* `code16`, `code32`
The `repXX` prefixes are for string operations (`movsd` etc), but will be set
for any opcode. Only the last of the family will be encoded.
The `code16` prefix will generate instructions to be run on a CPU in 16bit mode,
independently of the global CPU mode. For example,
code16 mov ax, 42h
will generate `"\xb8\x42\x00"` (no opsz override prefix), and will decode or
run incorrectly on a 32bit CPU.
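To make the role of the operand-size override concrete, here is a standalone sketch that hand-encodes `mov ax/eax, imm` (opcode `B8`) for a given CPU mode. The `encode_mov_acc_imm` helper is hypothetical and does not call Metasm; it only reproduces the standard x86 rule that the `66h` prefix is emitted when the operand size differs from the mode's default.

```ruby
# Sketch: encode `mov ax, imm16` or `mov eax, imm32` (opcode B8 + register 0).
# 0x66 is the operand-size override prefix.
def encode_mov_acc_imm(imm, operand_bits:, mode_bits:)
  bytes = []
  bytes << 0x66 if operand_bits != mode_bits # override only when sizes differ
  bytes << 0xb8
  (operand_bits / 8).times { |i| bytes << ((imm >> (8 * i)) & 0xff) } # little-endian immediate
  bytes.pack('C*')
end

in_16bit_mode = encode_mov_acc_imm(0x42, operand_bits: 16, mode_bits: 16) # "\xb8\x42\x00"
in_32bit_mode = encode_mov_acc_imm(0x42, operand_bits: 16, mode_bits: 32) # "\x66\xb8\x42\x00"
```

A 32bit CPU reading the unprefixed 16bit form would take `B8` as `mov eax, imm32` and consume two extra bytes as part of the immediate, which is why such code decodes incorrectly there.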
The encoder also has code to handle `jmp hints` prefixes, but the parser has
no equivalent prefix support.
There is currently no way to specify a segment-override prefix for instructions
not using a ModRM argument. Use a `db 26h` style line.
Suffixes
########
The parser implements a specific feature to allow the differentiation of
otherwise ambiguous opcodes, in the form of instruction suffixes.
By default, the assembler will generate the shortest encoding for a given
instruction. To force encoding of another form you can add a specific
suffix to the instruction. In general, metasm will use e.g. register sizes
when possible to avoid this kind of situation, but with immediate-only
displacements this is necessary.
or.a16 [1234h], eax ; use a 16-bit address
or [bx], eax ; use a 16-bit address (implicit from the bx register)
or eax, 1 ; "\x83\xc8\x01"
or.i8 eax, 1 ; "\x83\xc8\x01" (same, shortest encoding)
or.i eax, 1 ; "\x81\xc8\x01\x00\x00\x00" (constant stored in a 32bit field)
movsd.a16 ; use a 16-bit address-size override prefix (copy dword [si] to [di])
push.i16 42h ; push a 16-bit integer
The suffixes are available as follows:
* if the opcode takes an integer argument that can be encoded in either an 8-bit or a <cpu size>-bit field, the `.i` and `.i8` variants are created
* if the opcode takes a memory indirection as argument, or is a string operation (`movsd`, `scasb`, etc) the `.a16` and `.a32` variants are created
* if the opcode takes a single integer argument, a far pointer, or is a return instruction, the `.i16` and `.i32` variants are created
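The choice between the short and long immediate forms can be sketched as follows. This standalone snippet hand-encodes `or eax, imm` with the two encodings quoted in the examples above (`83 c8 ib` and `81 c8 id`); the `encode_or_eax_imm` helper and its `suffix` parameter are hypothetical and only imitate the effect of the `.i8` and `.i` suffixes.

```ruby
# Sketch: pick the imm8 or imm32 form of `or eax, imm`.
# suffix: nil (shortest encoding), :i8 (force 8-bit field), :i (force 32-bit field)
def encode_or_eax_imm(imm, suffix = nil)
  fits_i8 = (-128..127).cover?(imm) # imm8 is sign-extended by the CPU
  use_i8 =
    case suffix
    when nil then fits_i8
    when :i8 then fits_i8 || raise(ArgumentError, "#{imm} does not fit in 8 bits")
    when :i  then false
    else raise ArgumentError, "unknown suffix #{suffix.inspect}"
    end

  if use_i8
    [0x83, 0xc8, imm & 0xff].pack('C*')
  else
    [0x81, 0xc8].pack('C*') + [imm & 0xffff_ffff].pack('V')
  end
end

short_form = encode_or_eax_imm(1)      # "\x83\xc8\x01"
long_form  = encode_or_eax_imm(1, :i)  # "\x81\xc8\x01\x00\x00\x00"
```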
C parser
--------
The Ia32 C parser will initialize the type sizes with the `ilp32` memory
model, which is:
* short = 16bits
* int = 32bits
* long = 32bits
* long long = 64bits
* pointer = 32bits
In 16bit mode, the model is `ilp16`, which may not be correct (the 16bits
compiler has not been tested anyway).
The following macros are defined (in the asm preprocessor too)
* `_M_IX86` = 500
* `_X86_`
* `__i386__`
+108
@@ -0,0 +1,108 @@
SerialStruct
============
This is a helper class to handle binary packed data, especially to
represent <core/ExeFormat.txt> structures.
The implementation is in `metasm/exe_format/serialstruct.rb`.
Basics
------
The class defines some class methods, such as:
* `dword`
* `byte`
* `strz`
These methods can be used directly in subclass definitions, e.g.
class MyHeader < SerialStruct
dword :signature
dword :length
end
This will associate the sequence of fields to this structure, which
is used in the `#encode` and `#decode` methods.
These methods rely on an <core/ExeFormat.txt> instance to define
the corresponding `decode_dword` and `encode_dword` methods.
You can then simply call:
hdr = MyHeader.decode(myexefmt)
which will call `myexefmt.decode_dword` twice to populate the
`signature` and `length` fields of the MyHeader instance.
You can also redefine the `#decode` method to handle special cases.
The fields defined this way can be assigned a default value that
will be used when encoding the structure. The syntax is:
dword :fieldname, defaultvalue
If you have a long sequence of identically-typed fields, you can use
the plural form:
dwords :f1, :f2, :f3, :f4
To define your own field types, you should create a new subclass and call the
`new_field` class method. For integral fields, use `new_int_field(fldname)`
that will automatically define the decode/encode routines, and create the
plural form.
class MyStruct < SerialStruct
new_int_field :zword
zwords :offset, :length
end
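The field-declaration idea can be imitated in a few lines of plain Ruby. The sketch below is a toy: `ToyStruct` is a hypothetical class, it decodes directly from a String with `unpack1` rather than through an ExeFormat instance, and it assumes little-endian fields. It is not the implementation in `metasm/exe_format/serialstruct.rb`.

```ruby
# Toy re-creation of the declaration DSL; not Metasm's SerialStruct.
class ToyStruct
  # pack/unpack directive and byte length for each field type (little-endian assumed)
  TYPES = { byte: ['C', 1], word: ['v', 2], dword: ['V', 4] }.freeze

  class << self
    # Ordered list of [name, type, default] for this class
    def fields
      @fields ||= []
    end

    def decode(str)
      obj = new
      off = 0
      fields.each do |name, type, _default|
        fmt, len = TYPES.fetch(type)
        obj.send("#{name}=", str.byteslice(off, len).unpack1(fmt))
        off += len
      end
      obj
    end
  end

  TYPES.each_key do |type|
    # Singular form with optional default: `dword :name, 42`
    define_singleton_method(type) do |name, default = 0|
      attr_accessor name
      fields << [name, type, default]
    end
    # Plural form: `dwords :a, :b, :c`
    define_singleton_method("#{type}s") { |*names| names.each { |n| send(type, n) } }
  end

  def initialize
    self.class.fields.each { |name, _type, default| send("#{name}=", default) }
  end

  def encode
    self.class.fields.map { |name, type, _| [send(name)].pack(TYPES.fetch(type)[0]) }.join
  end
end

class MyHeader < ToyStruct
  dword :signature, 0x1234
  dword :length
end

hdr = MyHeader.decode([0x11223344, 8].pack('VV'))
```

Keeping the field list on the class is what lets a single generic `decode`/`encode` pair serve every subclass.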
Symbolic constants
------------------
The class has built-in support for symbolic constants and bit fields.
For example, suppose you have a numeric `:type` field, which corresponds
to a set of numeric constants `TYPE_FOO TYPE_BAR TYPE_LOL`. You can use:
TYPES = { 2 => 'FOO', 3 => 'BAR', 4 => 'LOL' }
dword :type
fld_enum :type, TYPES
With this, the standard `#decode` method will first decode the numeric value
of the field, and then lookup the value in the enum hash to find the
corresponding symbol, and use it as the field value.
If there is no mapping, the numeric value is retained. The reverse operation
is done with `#encode`.
For the bitfields, the method is `fld_bits`, and the software will try to
match *OR-ed* values from the bitfield to generate an array of symbols.
BITS = { 1 => 'B1', 2 => 'B2', 4 => 'B4' }
dword :foo
fld_bits :foo, BITS
which will give, for the numeric value `0x15`, `["B1", "B4", 0x10]`
The hashes used for fld_bits or fld_enum can be dynamically determined, by
using the block version of those methods. The block will receive the ExeFormat
instance and the SerialStruct instance, and should return the Hash.
This can be useful when the meaning of a bitfield varies with some generic
property of the exe, e.g. the target architecture.
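The enum and bitfield lookups described in this section can be sketched with two small standalone functions. `decode_enum` and `decode_bits` are hypothetical names for this illustration; they reproduce the behaviour described in the text (unknown values are kept as numbers), not Metasm's actual code.

```ruby
# Enum: map the numeric value to its symbol, keep the number when unmapped.
def decode_enum(value, enum)
  enum.fetch(value, value)
end

# Bitfield: collect the names of all set bits, then append any leftover bits
# as a single numeric value.
def decode_bits(value, bits)
  names = []
  rest = value
  bits.each do |mask, name|
    next unless mask != 0 && (rest & mask) == mask
    names << name
    rest &= ~mask
  end
  names << rest unless rest.zero?
  names
end

TYPES = { 2 => 'FOO', 3 => 'BAR', 4 => 'LOL' }
BITS  = { 1 => 'B1', 2 => 'B2', 4 => 'B4' }

decode_enum(3, TYPES)    # => "BAR"
decode_enum(9, TYPES)    # => 9
decode_bits(0x15, BITS)  # => ["B1", "B4", 16]
```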
Hooks
-----
It is also possible to define a hook that will be called at some point during
the object binary decoding. It will receive the exe and struct instances.
class Header < SerialStruct
dword :machine
decode_hook { |exe, hdr| raise "unknown machine" if hdr.machine > 4 }
dword :bodylength
end
+145
@@ -0,0 +1,145 @@
VirtualString
=============
This class is an abstract representation of an arbitrary sized byte array
with methods to load parts of it on demand. It is useful to represent
a program virtual memory and allow metasm to work on it while only reading
bytes from it when actually needed.
The base class is defined in `metasm/os/main.rb`.
Basics
------
The API of the object is designed to be compatible with a standard String (ASCII-8BIT).
The main restriction is that the size of this string cannot be changed:
concatenation / shortening is not supported.
The main operation on the object should be `[]` and `[]=`, that is,
reading some subpart of the string, or overwriting some substring.
The arguments are the same as for a String, with the exception that
rewrite raises an IndexError if the rewriting would change the string
length.
A few methods are written specifically with the VirtualString semantics,
others are redirected to a temporary real String generated with `realstring`.
The VirtualString works with a `page` concept, that represents some arbitrary
chunks of data that can be actually read from the underlying target, e.g. a
memory page (4096 bytes) when mapping a process virtual address space.
Instances get to define a `pagelength` suitable for the specific implementation.
Whenever a substring is requested from a VirtualString, if the substring
length is less than the page size, an actual read is made and a String is
returned.
If the length is greater however, a new VirtualString is created to map this
new *view* without actually reading.
To force the conversion to a String, use the `realstring` or `to_str` method.
The latter is preferred, as it works on both Strings and VirtualStrings.
To force the creation of a new VirtualString, use the `dup(start, len)` method.
When reading actual bytes, a local page cache is used. By default it has only 4
pages, and can be invalidated using `invalidate`.
The cache is automatically invalidated when part of the string is written to.
The VirtualString may index *invalid* pages (e.g. unmapped memory range in a
process address space); you can check that with `page_invalid?`, passing an
index as parameter.
Creation
--------
To create your own flavor of VirtualString, you must:
* define your subclass that inherits from `VirtualString`
* define your initializer, that takes whatever arguments make sense (e.g. a
*pid*, *handle*, Socket..)
* your initializer must call super(a, l) with arguments:
** current view absolute address (should default to 0), will be saved in
`@addr_start`
** current view size (should default to something sensible, like 1<<32), saved
in `@length`
* your initializer can override the default page size by defining the
`@pagelength` variable.
* implement a `dup` method that takes optional arguments:
** new base address (default=`@addr_start`)
** new length (default=`@length`)
** returns a new instance of your class mapping over the specified window
* implement a `get_page` method, whose arguments are:
** absolute page address (will always be page-aligned)
** optional length, default=`@pagelength`
** returns a String of `length` bytes, or `nil` (e.g. unmapped area)
* optionally implement a `rewrite_at` method, to make your string writeable.
Arguments are the absolute write address, and the data to write there (a String).
Feel free to override any other method with an optimized version.
For example, the default `realstring` will repeatedly call `get_page` with
each page in the range 0..`length`, you may have a more efficient alternative.
You can alter the cache size by rewriting the `@pagecache_len` variable
**after** calling `super()` in `initialize`. The default value is 4, which you
may want to increase.
See the `WindowsRemoteString` source for a simple example (ignore the `open_pid`
method).
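The `get_page` contract and the page cache can be pictured with a standalone toy. `ToyPagedString` below is hypothetical and does not inherit from Metasm's `VirtualString`; the backing String stands in for a process, file or socket, and the `read` and `cached_page` helpers are assumptions made for this sketch.

```ruby
# Toy stand-in, not Metasm's VirtualString: on-demand page reads with a small cache.
class ToyPagedString
  attr_reader :length, :page_reads

  def initialize(backing, pagelength = 4096, cache_len = 4)
    @backing = backing.b   # stands in for the real target
    @length = @backing.bytesize
    @pagelength = pagelength
    @pagecache_len = cache_len
    @pagecache = []        # [[page_addr, data], ...], most recent first
    @page_reads = 0        # counts actual reads, for demonstration
  end

  # Contract from the text: page-aligned address in, String or nil (unmapped) out.
  def get_page(addr, len = @pagelength)
    @page_reads += 1
    return nil if addr >= @length
    @backing.byteslice(addr, len)
  end

  def cached_page(addr)
    hit = @pagecache.assoc(addr)
    return hit[1] if hit
    data = get_page(addr)
    @pagecache.unshift([addr, data])
    @pagecache.pop if @pagecache.length > @pagecache_len
    data
  end

  # Read `len` bytes starting at `start`, one page at a time.
  def read(start, len)
    out = ''.b
    while len > 0
      page_addr = start - (start % @pagelength)
      page = cached_page(page_addr)
      return nil unless page
      chunk = page.byteslice(start - page_addr, len)
      break if chunk.nil? || chunk.empty?
      out << chunk
      start += chunk.bytesize
      len -= chunk.bytesize
    end
    out
  end
end

vs = ToyPagedString.new('ABCDEFGHIJKLMNOPQRST', 8)
vs.read(6, 5)   # spans two 8-byte pages
```

A second read that falls inside already cached pages does not call `get_page` again, which is the point of the cache when each page read is expensive.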
Standard subclasses
-------------------
VirtualFile
###########
Defined in `metasm/os/main.rb`.
This class maps over an open file descriptor, and allows reading data on-demand.
It implements the `read` class method, similar to `File.read`, with the
file opened in binary mode. For a small file (<=4096), the content is
directly returned, otherwise a VirtualString is created.
This class is used by the default <core/ExeFormat.txt> `decode_file[_header]`
methods.
LinuxRemoteString
#################
Defined in `metasm/os/linux.rb`.
This class maps over the virtual memory of a Linux process.
Accesses are done through the `/proc/<pid>/mem` for reading.
The linux kernel requires that the target process be ptraced before we can
read this file, so the object will use the debugger instance passed to the
constructor, or create a new <core/PTrace.txt> object to stop the process
and read its memory during `get_page`.
If a <core/Debugger.txt> object was given, `get_page` will return `nil` if the
debugger indicates that the target is not stopped.
Writing is done through `PTrace#writemem` using `PTRACE_POKEDATA`.
WindowsRemoteString
###################
Defined in `metasm/os/windows.rb`.
This class maps over the virtual memory of a Windows process.
The memory accesses are done using the `Read/WriteProcessMemory` API.
The class method `open_pid` is defined, that will try to `OpenProcess`
first in read/write, and fallback to read-only mode.
GdbRemoteString
###############
Defined in `metasm/os/remote.rb`.
Maps over the virtual memory of a remote process debugged with a
<core/GdbClient.txt> instance, using `setmem` and `getmem`.
+61
@@ -0,0 +1,61 @@
WindowsExports
==============
This class is defined in `metasm/os/windows_exports.rb`
It defines an `EXPORT` constant, a Hash, whose keys
are the standard win32 API symbol names, and values
are the library name where you can find this symbol.
The equivalent for GNU/Linux is <core/GNUExports.txt>
Usage
-----
The main usage of this class is the automatic generation
of the <core/PE.txt> import directories from the
external symbols referenced by a binary during compilation.
This is done in the `automagic_symbols` method.
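The lookup behind this kind of import generation can be sketched in plain Ruby. The `SAMPLE_EXPORT` hash below is a tiny hand-written sample in the same symbol-to-library shape as the `EXPORT` constant, and `group_imports` is a hypothetical helper; neither is taken from Metasm, and this is not how `automagic_symbols` is implemented.

```ruby
# Hand-written sample in the shape described above: symbol name => library name.
SAMPLE_EXPORT = {
  'ExitProcess'    => 'kernel32',
  'GetProcAddress' => 'kernel32',
  'MessageBoxA'    => 'user32',
  'printf'         => 'msvcrt'
}.freeze

# Split referenced symbols into per-library import lists and unresolved names.
def group_imports(symbols, export = SAMPLE_EXPORT)
  known, unknown = symbols.partition { |sym| export.key?(sym) }
  [known.group_by { |sym| export[sym] }, unknown]
end

by_lib, unresolved = group_imports(%w[MessageBoxA ExitProcess my_local_func printf])
```

Each key of `by_lib` would correspond to one import directory entry, with its symbol list as the imported names.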
Symbols
-------
The current version holds the symbols available in the
Windows XP SP2 32-bit standard libraries:
* `ntdll`
* `kernel32`
* `user32`
* `gdi32`
* `advapi32`
* `ws2_32`
* `msvcrt`
* `comdlg32`
* `psapi`
Ruby symbols are also defined, from `msvcrt-ruby18`.
Ruby library name
-----------------
On creation, the current ruby library name is inferred
from the `RUBY_PLATFORM` constant, in an effort to
try to use the available ruby library filename.
The only transformation supported now is to rewrite
the ruby version number appearing in the filename for
msvcrt-compiled binaries, so that you get the correct
`msvcrt-ruby192` name for example under ruby1.9.
This is implemented in the `patch_rubylib_to_current_interpreter`
method (which is aptly named).
Warning
#######
Note that binaries compiled this way will not work on
other machines where the exact same library is unavailable.
+1
@@ -0,0 +1 @@
See <core_classes.txt>
+75
@@ -0,0 +1,75 @@
Core classes
============
Core
----
* <core/Expression.txt>
* <core/EncodedData.txt>
* <core/VirtualString.txt>
* <core/Opcode.txt>
* <core/Instruction.txt>
CPUs
----
* <core/CPU.txt>
* <core/Ia32.txt>
* <core/X86_64.txt>
* <core/MIPS.txt>
* <core/PowerPC.txt>
* <core/Sh4.txt>
ExeFormats
----------
* <core/ExeFormat.txt>
* <core/SerialStruct.txt>
* <core/AutoExe.txt>
* <core/Shellcode.txt>
* <core/PE.txt>
* <core/COFF.txt>
* <core/ELF.txt>
C
----
* <core/Preprocessor.txt>
* <core/CParser.txt>
* <core/CCompiler.txt>
Debugger
--------
* <core/OS.txt>
* <core/Debugger.txt>
* <core/LinDebugger.txt>
* <core/WinDebugger.txt>
* <core/PTrace.txt>
* <core/GdbClient.txt>
* <core/WinDbgAPI.txt>
Disassembler
------------
* <core/Disassembler.txt>
* <core/DecodedFunction.txt>
* <core/DecodedInstruction.txt>
* <core/InstructionBlock.txt>
* <core/Decompiler.txt>
GUI
----
* <core/Gui.txt>
* <core/Gui_Drawable.txt>
* <core/Gui_Window.txt>
* <core/Gui_DasmWidget.txt>
* <core/Gui_DebugWidget.txt>
Others
------
* <core/DynLdr.txt>
+53
@@ -0,0 +1,53 @@
Metasm feature list
===================
Metasm is a cross-architecture assembler, disassembler, compiler, linker and debugger.
See <use_cases.txt>
Architectures
-------------
It is written in such a way that it is easy to add support for new architectures.
For now, the following architectures are in:
* Intel <core/Ia32.txt> (16 and 32bits)
* Intel <core/X86_64.txt> (*aka* Ia32 64bits, X64, AMD64)
* MIPS
* PowerPC
* Sh4
Development is generally more focused on Ia32 and X86_64.
File formats
------------
The following executable file formats are supported:
* <core/Shellcode.txt> (raw binary)
* <core/PE.txt>/<core/COFF.txt> (32/64bits)
* <core/ELF.txt> (32/64bits)
Those are supported in a more limited way:
* Mach-O, UniversalBinary
* MZ
* A.out
* XCoff
* NDS
Features
--------
The framework includes
* a graphical <usage/disassembler.txt>
* a graphical <usage/debugger.txt>
* low and high-level debugging support (Ia32 only for now) under Windows, Linux and remote (via a GdbServer)
* an advanced disassembler engine, with limited emulation support
* a full <usage/C_parser.txt> (with preprocessor)
* an experimental <usage/C_compiler.txt> (Ia32 only)
* an experimental <usage/decompiler.txt> (Ia32 only)
+59
@@ -0,0 +1,59 @@
The Metasm framework documentation
==================================
Metasm
------
The Metasm framework is open-source software designed to interact with
the various forms of binary code. It is written in pure Ruby
(<http://ruby-lang.org/>).
More detailed information can be found in the <feature_list.txt>.
It is distributed freely under the terms of the LGPL.
Documentation organisation
--------------------------
This documentation is split into different parts:
* the <core_classes.txt>
* the major <use_cases.txt>
* <code_organisation.txt>
The first part describes the internal structure of the framework, the
second part is a higher level overview of the software and shows how
the various parts are used and can interact. The last part explains
the role of the source files and directories.
Documentation progress
----------------------
The documentation is written here and there in my free time, and is **very**
**incomplete** as of now. Specifically, all internal links you'll find
ending in `.txt` are links to pages that have not been written yet.
Install notes
-------------
See the <install_notes.txt>
Authors
-------
Metasm is mostly written by Yoann Guillot.
Some parts were added by various contributors, including:
* Julien Tinnès
* Raphaël Rigo
* Arnaud Cornet
* Alexandre Gazet
Contact
-------
The latest version of this documentation can be found on the Metasm site: <http://metasm.cr0.org/doc>
Patches, bug reports, feature requests should be sent to metasm@cr0.org
+170
@@ -0,0 +1,170 @@
Metasm installation notes
=========================
Metasm is a pure ruby lib, and the core (`metasm/` subdir) does not depend on any
ruby library (except the `metasm/gui`, which may use `gtk2`).
So the install is quite simple.
Download
--------
Metasm is distributed using the `mercurial` source control system.
The recommended way to install is to use that tool, so you can always be
up-to-date with the latest developments.
You will also need the Ruby interpreter (versions 1.8 and 1.9 are supported).
Linux
#####
Issue the following commands to install the `mercurial` and `ruby` software:
sudo apt-get install ruby
sudo apt-get install mercurial
Then download metasm with
hg clone http://metasm.cr0.org/hg/metasm/
This will create a new directory `metasm/` with the latest version of the
framework.
Windows
#######
The ruby website offers many ruby packages. The *RubyInstaller* should
work fine. Go to <http://www.ruby-lang.org/en/downloads/>, under the
`Ruby on Windows` section.
The `mercurial` website has links to various installers:
<http://mercurial.selenic.com/wiki/BinaryPackages>
Choose one, then use the `clone repository` command with the following
url:
http://metasm.cr0.org/hg/metasm/
This will create a new subdirectory `metasm/` with the latest version of
the framework.
Upgrading
---------
To upgrade to the latest and greatest version, launch a shell prompt and
navigate to the metasm directory, then issue:
hg pull -u
which will upgrade your installation to the latest available version.
With `TortoiseHG`, simply issue the `upgrade` command on the `metasm`
directory.
Local installation
------------------
If you simply want to install metasm for your personal use (as opposed to a
system-wide installation), follow these steps.
Download the metasm source files under any directory, then update the
environment variable `RUBYLIB` to include this path. The path you add
should be the directory containing the `metasm.rb` script and the `metasm/`,
`samples/`, `doc/` subdirectories.
If `RUBYLIB` is empty or nonexistent, simply set its value to the directory,
otherwise you can append the path to an existing list by separating the values
with a `:` such as:
RUBYLIB='/foo/bar:/home/jj/metasm'
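The rule for building the new `RUBYLIB` value can be written down as a short Ruby sketch. The `append_rubylib` helper is hypothetical and only computes the string; it does not modify your environment. `File::PATH_SEPARATOR` is Ruby's constant for the platform's list separator.

```ruby
# Sketch: compute the RUBYLIB value after adding `dir`.
def append_rubylib(current, dir, sep = File::PATH_SEPARATOR)
  return dir if current.nil? || current.empty?   # unset or empty: use the directory alone
  parts = current.split(sep)
  return current if parts.include?(dir)          # already listed: leave unchanged
  (parts + [dir]).join(sep)
end

append_rubylib(nil, '/home/jj/metasm', ':')         # => "/home/jj/metasm"
append_rubylib('/foo/bar', '/home/jj/metasm', ':')  # => "/foo/bar:/home/jj/metasm"
```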
Linux
#####
Under linux or cygwin, this is done by modifying your shell profile, e.g.
`~/.bash_profile`, by adding a line such as:
export RUBYLIB='/home/jj/metasm'
You may need to restart your session or start a new shell for the changes
to take effect.
Windows
#######
The environment variables can be set through:
* rightclick on `my computer`
* select tab `advanced`
* click `environment variables`
If a `RUBYLIB` variable exists, add `;C:\path\to\metasm` at the end, otherwise
create a new variable `RUBYLIB` with the path as value.
You may need to restart your session for the changes to take effect.
Systemwide installation
-----------------------
For a systemwide installation, you should create a `metasm.rb` file in the `site_ruby`
directory (that would be `/usr/lib/ruby/1.8/` under linux, or `C:\apps\ruby\lib\ruby\1.8\`
for windows users) with the content
# if metasm.rb can be found in /home/jj/metasm/metasm.rb
require '/home/jj/metasm/metasm'
Testing
-------
Open a new shell session and type
ruby -r metasm -e "p Metasm::VERSION"
It should print a single line with a (meaningless) number in it.
Gui
----
If you intend to use the graphical user-interface (debugger/disassembler)
under Windows with a 32bit x86 ruby, this should work out of the
box. In any other case, you'll need the `ruby-gtk2` library.
Linux
#####
Under linux, use your package manager to install `ruby-gtk2`, e.g. for
Debian/Ubuntu, type:
sudo apt-get install libgtk2-ruby
Windows
#######
If you run a 32bit Ia32 ruby interpreter (check that `ruby -v` returns
something like `[i386-mswin32]`), the Gui should work right away without
`gtk2`, so go directly to the `Testing` part.
Otherwise, you'll need to install the `gtk2` libs and the ruby bindings
manually. Please follow the instructions at
<http://ruby-gnome2.sourceforge.jp/hiki.cgi?Install+Guide+for+Windows>
Testing
#######
To test the correct working of the Gui, simply launch the
`samples/disassemble-gui.rb` script found in the metasm directory
(double-click on the script, or type `ruby samples/disassemble-gui.rb` at
a command prompt). It should display a window with a menu, and should
answer to a `ctrl-o` keystroke with an `open binary file` dialog.
See the <usage/disassembler_gui.txt> for more information.
+3
@@ -0,0 +1,3 @@
span.quote {
font-family: monospace;
}
+1
@@ -0,0 +1 @@
See <use_cases.txt>
+18
@@ -0,0 +1,18 @@
Metasm use cases
================
Metasm is intended to be a binary manipulation toolbox.
There are quite a lot of possible usages that can be derived from the
<feature_list.txt>.
The major ones relate to:
* the scriptable <usage/debugger.txt>
* the <usage/disassembler.txt> (with the optional <usage/disassembler_gui.txt>)
* the <usage/assembler.txt>
* the <usage/C_parser.txt>
* the <usage/C_compiler.txt>
* the <usage/exe_manipulation.txt> facilities
and various interactions between those.
+2 -9
@@ -15,7 +15,6 @@ module Metasm
Const_autorequire_equiv = {
'X86' => 'Ia32', 'PPC' => 'PowerPC',
'X64' => 'X86_64', 'AMD64' => 'X86_64',
'MIPS64' => 'MIPS',
'UniversalBinary' => 'MachO', 'COFFArchive' => 'COFF',
'DEY' => 'DEX',
'PTrace' => 'LinOS', 'FatELF' => 'ELF',
@@ -33,9 +32,8 @@ module Metasm
# files to require to get the definition of those constants
Const_autorequire = {
'Ia32' => 'cpu/ia32', 'MIPS' => 'cpu/mips', 'PowerPC' => 'cpu/ppc', 'ARM' => 'cpu/arm',
'X86_64' => 'cpu/x86_64', 'Sh4' => 'cpu/sh4', 'Dalvik' => 'cpu/dalvik', 'ARC' => 'cpu/arc',
'Python' => 'cpu/python', 'Z80' => 'cpu/z80', 'CY16' => 'cpu/cy16', 'BPF' => 'cpu/bpf',
'Ia32' => 'ia32', 'MIPS' => 'mips', 'PowerPC' => 'ppc', 'ARM' => 'arm',
'X86_64' => 'x86_64', 'Sh4' => 'sh4', 'Dalvik' => 'dalvik',
'C' => 'compile_c',
'MZ' => 'exe_format/mz', 'PE' => 'exe_format/pe',
'ELF' => 'exe_format/elf', 'COFF' => 'exe_format/coff',
@@ -43,15 +41,10 @@ module Metasm
'AOut' => 'exe_format/a_out', 'MachO' => 'exe_format/macho',
'DEX' => 'exe_format/dex',
'NDS' => 'exe_format/nds', 'XCoff' => 'exe_format/xcoff',
'GameBoyRom' => 'exe_format/gb',
'Bflt' => 'exe_format/bflt', 'Dol' => 'exe_format/dol',
'PYC' => 'exe_format/pyc', 'JavaClass' => 'exe_format/javaclass',
'SWF' => 'exe_format/swf', 'ZIP' => 'exe_format/zip',
'Shellcode_RWX' => 'exe_format/shellcode_rwx',
'Gui' => 'gui',
'WindowsExports' => 'os/windows_exports',
'GNUExports' => 'os/gnu_exports',
'Debugger' => 'debug',
'LinOS' => 'os/linux', 'WinOS' => 'os/windows',
'GdbClient' => 'os/remote',
'Disassembler' => 'disassemble',
@@ -5,7 +5,8 @@
require 'metasm/main'
require 'metasm/cpu/ppc/parse'
require 'metasm/cpu/ppc/encode'
require 'metasm/cpu/ppc/decode'
require 'metasm/cpu/ppc/decompile'
require 'metasm/arm/parse'
require 'metasm/arm/encode'
require 'metasm/arm/decode'
require 'metasm/arm/render'
require 'metasm/arm/debug'
@@ -4,7 +4,7 @@
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/arm/opcodes'
require 'metasm/arm/opcodes'
module Metasm
class ARM
@@ -15,7 +15,7 @@ class ARM
@dbg_register_flags ||= :flags
end
def dbg_register_list
def dbg_register_list
@dbg_register_list ||= [:r0, :r1, :r2, :r3, :r4, :r5, :r6, :r7, :r8, :r9, :r10, :r11, :r12, :sp, :lr, :pc]
end
@@ -3,7 +3,7 @@
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/arm/opcodes'
require 'metasm/arm/opcodes'
require 'metasm/decode'
module Metasm
@@ -38,7 +38,7 @@ class ARM
end
def decode_findopcode(edata)
return if edata.ptr+4 > edata.length
return if edata.ptr >= edata.data.length
di = DecodedInstruction.new(self)
val = edata.decode_imm(:u32, @endianness)
di.instance_variable_set('@raw', val)
@@ -58,11 +58,11 @@ class ARM
op = di.opcode
di.instruction.opname = op.name
val = di.instance_variable_get('@raw')
field_val = lambda { |f|
r = (val >> @fields_shift[f]) & @fields_mask[f]
case f
when :i12; Expression.make_signed(r, 12)
when :i16; Expression.make_signed(r, 16)
when :i24; Expression.make_signed(r, 24)
when :i8_12; ((r >> 4) & 0xf0) | (r & 0xf)
when :stype; [:lsl, :lsr, :asr, :ror][r]
@@ -88,8 +88,7 @@ class ARM
di.instruction.args << case a
when :rd, :rn, :rm; Reg.new field_val[a]
when :rm_rs; Reg.new field_val[:rm], field_val[:stype], Reg.new(field_val[:rs])
when :rm_is; Reg.new field_val[:rm], field_val[:stype], field_val[:shifti]
when :i12; Expression[field_val[a]]
when :rm_is; Reg.new field_val[:rm], field_val[:stype], field_val[:shifti]*2
when :i24; Expression[field_val[a] << 2]
when :i8_r
i = field_val[:i8]
@@ -100,14 +99,14 @@ class ARM
o = case a
when :mem_rn_rm; Reg.new(field_val[:rm])
when :mem_rn_i8_12; field_val[:i8_12]
when :mem_rn_rms; Reg.new(field_val[:rm], field_val[:stype], field_val[:shifti])
when :mem_rn_rms; Reg.new(field_val[:rm], field_val[:stype], field_val[:shifti]*2)
when :mem_rn_i12; field_val[:i12]
end
Memref.new(b, o, field_val[:u], op.props[:baseincr])
when :reglist
di.instruction.args.last.updated = true if op.props[:baseincr]
msk = field_val[a]
l = RegList.new((0..15).map { |n| Reg.new(n) if (msk & (1 << n)) > 0 }.compact)
l = RegList.new((0..15).map { |i| Reg.new(i) if (msk & (1 << i)) > 0 }.compact)
l.usermoderegs = true if op.props[:usermoderegs]
l
else raise SyntaxError, "Internal error: invalid argument #{a} in #{op.name}"
@@ -119,7 +118,7 @@ class ARM
end
def decode_instr_interpret(di, addr)
if di.opcode.args[-1] == :i24
if di.opcode.args.include? :i24
di.instruction.args[-1] = Expression[di.instruction.args[-1] + addr + 8]
end
di
@@ -128,7 +127,7 @@ class ARM
def backtrace_binding
@backtrace_binding ||= init_backtrace_binding
end
def init_backtrace_binding
@backtrace_binding ||= {}
end
@@ -141,9 +140,9 @@ class ARM
else arg
end
}
if binding = backtrace_binding[di.opcode.name]
binding[di, *a]
bd = binding[di, *a]
else
puts "unhandled instruction to backtrace: #{di}" if $VERBOSE
# assume nothing except the 1st arg is modified
@@ -155,7 +154,7 @@ class ARM
end
end
def get_xrefs_x(dasm, di)
if di.opcode.props[:setip]
[di.instruction.args.last]
@@ -4,15 +4,15 @@
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/arm/opcodes'
require 'metasm/arm/opcodes'
require 'metasm/encode'
module Metasm
class ARM
def encode_instr_op(program, instr, op)
def encode_instr_op(section, instr, op)
base = op.bin
set_field = lambda { |f, v|
v = v.reduce if v.kind_of?(Expression)
v = v.reduce if v.kind_of? Expression
case f
when :i8_12
base = Expression[base, :|, [[v, :&, 0xf], :|, [[v, :<<, 4], :&, 0xf00]]]
@@ -42,7 +42,7 @@ class ARM
when :rm_is
set_field[:rm, arg.i]
set_field[:stype, arg.stype]
set_field[:shifti, arg.shift]
set_field[:shifti, arg.shift/2]
when :mem_rn_rm, :mem_rn_rms, :mem_rn_i8_12, :mem_rn_i12
set_field[:rn, arg.base.i]
case sym
@@ -61,32 +61,17 @@ class ARM
when :reglist
set_field[sym, arg.list.inject(0) { |rl, r| rl | (1 << r.i) }]
when :i8_r
# XXX doublecheck this
b = arg.reduce & 0xffffffff
r = (0..15).find {
next true if b < 0x100
b = ((b << 2) & 0xffff_ffff) | ((b >> 30) & 3)
false
}
raise EncodeError, "Invalid constant" if not r
r = (0..15).find { next true if b < 0x10 ; b = (b >> 2) | ((b & 3) << 30) }
set_field[:i8, b]
set_field[:rotate, r]
when :i12, :i24
when :i16, :i24
val, mask, shift = arg, @fields_mask[sym], @fields_shift[sym]
end
}
if op.args[-1] == :i24
# convert label name for branch to relative offset
label = program.new_label('l_'+op.name)
target = val
target = target.rexpr if target.kind_of?(Expression) and target.op == :+ and not target.lexpr
val = Expression[[target, :-, [label, :+, 8]], :>>, 2]
EncodedData.new('', :export => { label => 0 }) <<
Expression[base, :|, [[val, :<<, shift], :&, mask]].encode(:u32, @endianness)
else
Expression[base, :|, [[val, :<<, shift], :&, mask]].encode(:u32, @endianness)
end
Expression[base, :|, [[val, :<<, shift], :&, mask]].encode(:u32, @endianness)
end
end
end
@@ -68,5 +68,8 @@ class ARM < CPU
@opcode_list
end
end
class ARM_THUMB < ARM
end
end
+177
@@ -0,0 +1,177 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/arm/main'
module Metasm
class ARM
private
def addop(name, bin, *args)
args << :cond if not args.delete :uncond
o = Opcode.new name, bin
o.args.concat(args & @valid_args)
(args & @valid_props).each { |p| o.props[p] = true }
args.grep(Hash).each { |h| o.props.update h }
# special args -> multiple fields
case (o.args & [:i8_r, :rm_is, :rm_rs, :mem_rn_rm, :mem_rn_i8_12, :mem_rn_rms, :mem_rn_i12]).first
when :i8_r; args << :i8 << :rotate
when :rm_is; args << :rm << :stype << :shifti
when :rm_rs; args << :rm << :stype << :rs
when :mem_rn_rm; args << :rn << :rm << :rsx << :u
when :mem_rn_i8_12; args << :rn << :i8_12 << :u
when :mem_rn_rms; args << :rn << :rm << :stype << :shifti << :u
when :mem_rn_i12; args << :rn << :i12 << :u
end
(args & @fields_mask.keys).each { |f|
o.fields[f] = [@fields_mask[f], @fields_shift[f]]
}
@opcode_list << o
end
def addop_data_s(name, op, a1, a2, *h)
addop name, op | (1 << 25), a1, a2, :i8_r, :rotate, *h
addop name, op, a1, a2, :rm_is, *h
addop name, op | (1 << 4), a1, a2, :rm_rs, *h
end
def addop_data(name, op, a1, a2)
addop_data_s name, op << 21, a1, a2
addop_data_s name+'s', (op << 21) | (1 << 20), a1, a2, :cond_name_off => name.length
end
def addop_load_puw(name, op, *a)
addop name, op, {:baseincr => :post}, :rd, :u, *a
addop name, op | (1 << 24), :rd, :u, *a
addop name, op | (1 << 24) | (1 << 21), {:baseincr => :pre}, :rd, :u, *a
end
def addop_load_lsh_o(name, op)
addop_load_puw name, op, :rsz, :mem_rn_rm, {:cond_name_off => 3}
addop_load_puw name, op | (1 << 22), :mem_rn_i8_12, {:cond_name_off => 3}
end
def addop_load_lsh
op = 9 << 4
addop_load_lsh_o 'strh', op | (1 << 5)
addop_load_lsh_o 'ldrd', op | (1 << 6)
addop_load_lsh_o 'strd', op | (1 << 6) | (1 << 5)
addop_load_lsh_o 'ldrh', op | (1 << 20) | (1 << 5)
addop_load_lsh_o 'ldrsb', op | (1 << 20) | (1 << 6)
addop_load_lsh_o 'ldrsh', op | (1 << 20) | (1 << 6) | (1 << 5)
end
def addop_load_puwt(name, op, *a)
addop_load_puw name, op, *a
addop name+'t', op | (1 << 21), {:baseincr => :post, :cond_name_off => name.length}, :rd, :u, *a
end
def addop_load_o(name, op, *a)
addop_load_puwt name, op, :mem_rn_i12, *a
addop_load_puwt name, op | (1 << 25), :mem_rn_rms, *a
end
def addop_load(name, op)
addop_load_o name, op
addop_load_o name+'b', op | (1 << 22), :cond_name_off => name.length
end
def addop_ldm_go(name, op, *a)
addop name, op, :rn, :reglist, {:cond_name_off => 3}, *a
end
def addop_ldm_w(name, op, *a)
addop_ldm_go name, op, *a # base reg untouched
addop_ldm_go name, op | (1 << 21), {:baseincr => :post}, *a # base updated
end
def addop_ldm_s(name, op)
addop_ldm_w name, op # transfer regs
addop_ldm_w name, op | (1 << 22), :usermoderegs # transfer usermode regs
end
def addop_ldm_p(name, op)
addop_ldm_s name+'a', op # target memory included
addop_ldm_s name+'b', op | (1 << 24) # target memory excluded, transfer starts at next addr
end
def addop_ldm_u(name, op)
addop_ldm_p name+'d', op # transfer made downward
addop_ldm_p name+'i', op | (1 << 23) # transfer made upward
end
def addop_ldm(name, op)
addop_ldm_u name, op
end
# ARMv6 instruction set, aka arm7/arm9
def init_arm_v6
@opcode_list = []
@valid_props << :baseincr << :cond << :cond_name_off << :usermoderegs <<
:tothumb << :tojazelle
@valid_args.concat [:rn, :rd, :rm, :crn, :crd, :crm, :cpn, :reglist, :i24,
:rm_rs, :rm_is, :i8_r, :mem_rn_rm, :mem_rn_i8_12, :mem_rn_rms, :mem_rn_i12]
@fields_mask.update :rn => 0xf, :rd => 0xf, :rs => 0xf, :rm => 0xf,
:crn => 0xf, :crd => 0xf, :crm => 0xf, :cpn => 0xf,
:rnx => 0xf, :rdx => 0xf, :rsx => 0xf,
:shifti => 0x1f, :stype => 3, :rotate => 0xf, :reglist => 0xffff,
:i8 => 0xff, :i12 => 0xfff, :i24 => 0xff_ffff, :i8_12 => 0xf0f,
:u => 1, :mask => 0xf, :sbo => 0xf, :cond => 0xf
@fields_shift.update :rn => 16, :rd => 12, :rs => 8, :rm => 0,
:crn => 16, :crd => 12, :crm => 0, :cpn => 8,
:rnx => 16, :rdx => 12, :rsx => 8,
:shifti => 7, :stype => 5, :rotate => 8, :reglist => 0,
:i8 => 0, :i12 => 0, :i24 => 0, :i8_12 => 0,
:u => 23, :mask => 16, :sbo => 12, :cond => 28
addop_data 'and', 0, :rd, :rn
addop_data 'eor', 1, :rd, :rn
addop_data 'xor', 1, :rd, :rn
addop_data 'sub', 2, :rd, :rn
addop_data 'rsb', 3, :rd, :rn
addop_data 'add', 4, :rd, :rn
addop_data 'adc', 5, :rd, :rn
addop_data 'sbc', 6, :rd, :rn
addop_data 'rsc', 7, :rd, :rn
addop_data 'tst', 8, :rdx, :rn
addop_data 'teq', 9, :rdx, :rn
addop_data 'cmp', 10, :rdx, :rn
addop_data 'cmn', 11, :rdx, :rn
addop_data 'orr', 12, :rd, :rn
addop_data 'or', 12, :rd, :rn
addop_data 'mov', 13, :rd, :rnx
addop_data 'bic', 14, :rd, :rn
addop_data 'mvn', 15, :rd, :rnx
addop 'b', 0b1010 << 24, :setip, :stopexec, :i24
addop 'bl', 0b1011 << 24, :setip, :stopexec, :i24, :saveip
addop 'bkpt', (0b00010010 << 20) | (0b0111 << 4) # other fields are available&unused, also cnd != AL is undef
addop 'blx', 0b1111101 << 25, :setip, :stopexec, :saveip, :tothumb, :h, :nocond, :i24
addop 'blx', (0b00010010 << 20) | (0b0011 << 4), :setip, :stopexec, :saveip, :tothumb, :rm
addop 'bx', (0b00010010 << 20) | (0b0001 << 4), :setip, :stopexec, :rm
addop 'bxj', (0b00010010 << 20) | (0b0010 << 4), :setip, :stopexec, :rm, :tojazelle
addop_load 'str', (1 << 26)
addop_load 'ldr', (1 << 26) | (1 << 20)
addop_load_lsh
addop_ldm 'stm', (1 << 27)
addop_ldm 'ldm', (1 << 27) | (1 << 20)
end
alias init_latest init_arm_v6
end
end
__END__
addop_cond 'mrs', 0b0001000011110000000000000000, :rd
addop_cond 'msr', 0b0001001010011111000000000000, :rd
addop_cond 'msrf', 0b0001001010001111000000000000, :rd
addop_cond 'mul', 0b000000000000001001 << 4, :rd, :rn, :rs, :rm
addop_cond 'mla', 0b100000000000001001 << 4, :rd, :rn, :rs, :rm
addop_cond 'swp', 0b0001000000000000000010010000, :rd, :rn, :rs, :rm
addop_cond 'swpb', 0b0001010000000000000010010000, :rd, :rn, :rs, :rm
addop_cond 'undef', 0b00000110000000000000000000010000
addop_cond 'swi', 0b00001111 << 24
addop_cond 'bkpt', 0b1001000000000000001110000
addop_cond 'movw', 0b0011 << 24, :movwimm
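As a rough illustration of how the @fields_mask and @fields_shift tables set up in init_arm_v6 place operand fields into a 32-bit instruction word, the sketch below packs the operands of `add r0, r1, r2` by hand. The names FIELDS_MASK, FIELDS_SHIFT, ADD_BASE and pack_fields are hypothetical; they reuse only a few table entries from the diff. Metasm's own encoder is not part of this hunk and may work differently.

```ruby
# Subset of the mask/shift tables from init_arm_v6 (copied from the diff).
FIELDS_MASK  = { :rn => 0xf, :rd => 0xf, :rm => 0xf, :cond => 0xf }
FIELDS_SHIFT = { :rn => 16,  :rd => 12,  :rm => 0,   :cond => 28 }

# Hypothetical helper: OR each field value into the base opcode word,
# masking it to the field width first and then shifting it into place.
def pack_fields(base, values)
  values.inject(base) { |word, (field, val)|
    word | ((val & FIELDS_MASK.fetch(field)) << FIELDS_SHIFT.fetch(field))
  }
end

# addop_data registers 'add' with op = 4, shifted to bit 21.
ADD_BASE = 4 << 21

# add r0, r1, r2 with the "always" condition (0xe), register form, no shift.
word = pack_fields(ADD_BASE, :cond => 0xe, :rd => 0, :rn => 1, :rm => 2)
puts format('%08x', word)
```

With these inputs the word is e0810002, which is the usual ARM encoding of `add r0, r1, r2`.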
@@ -4,7 +4,7 @@
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/arm/opcodes'
require 'metasm/arm/opcodes'
require 'metasm/parse'
module Metasm
@@ -26,32 +26,24 @@ class ARM
def parse_arg_valid?(op, sym, arg)
case sym
when :rd, :rs, :rn, :rm; arg.kind_of?(Reg) and arg.shift == 0 and (arg.updated ? op.props[:baseincr] : !op.props[:baseincr])
when :rm_rs; arg.kind_of?(Reg) and arg.shift.kind_of?(Reg)
when :rm_is; arg.kind_of?(Reg) and arg.shift.kind_of?(Integer)
when :i12, :i24, :i8_12; arg.kind_of?(Expression)
when :i8_r
if arg.kind_of?(Expression)
b = arg.reduce
!b.kind_of?(Integer) or (0..15).find {
b = ((b << 2) & 0xffff_ffff) | ((b >> 30) & 3)
b < 0x100 }
end
when :rd, :rs, :rn, :rm; arg.kind_of? Reg and arg.shift == 0 and (arg.updated ? op.props[:baseincr] : !op.props[:baseincr])
when :rm_rs; arg.kind_of? Reg and arg.shift.kind_of? Reg
when :rm_is; arg.kind_of? Reg and arg.shift.kind_of? Integer
when :i16, :i24, :i8_12, :i8_r; arg.kind_of? Expression
when :mem_rn_rm, :mem_rn_i8_12, :mem_rn_rms, :mem_rn_i12
os = case sym
when :mem_rn_rm; :rm
when :mem_rn_i8_12; :i8_12
when :mem_rn_rms; :rm_rs
when :mem_rn_i12; :i12
when :mem_rn_i12; :i16
end
arg.kind_of?(Memref) and parse_arg_valid?(op, os, arg.offset)
when :reglist; arg.kind_of?(RegList)
arg.kind_of? Memref and parse_arg_valid?(op, os, arg.offset)
when :reglist; arg.kind_of? RegList
end
# TODO check flags on reglist, check int values
end
def parse_argument(lexer)
raise lexer, "unexpected EOS" if not lexer.nexttok
if Reg.s_to_i[lexer.nexttok.raw]
arg = Reg.new Reg.s_to_i[lexer.readtok.raw]
lexer.skip_space
@@ -70,24 +62,22 @@ class ARM
when '!'
lexer.readtok
arg.updated = true
end if lexer.nexttok
end
elsif lexer.nexttok.raw == '{'
lexer.readtok
arg = RegList.new
loop do
lexer.skip_space
raise "unterminated reglist" if lexer.eos?
lexer.skip_space
if Reg.s_to_i[lexer.nexttok.raw]
arg.list << Reg.new(Reg.s_to_i[lexer.readtok.raw])
lexer.skip_space
raise "unterminated reglist" if lexer.eos?
end
case lexer.nexttok.raw
when ','; lexer.readtok
when '-'
lexer.readtok
lexer.skip_space
raise "unterminated reglist" if lexer.eos?
if not r = Reg.s_to_i[lexer.nexttok.raw]
raise lexer, "reglist parse error: invalid range"
end
@@ -105,22 +95,20 @@ class ARM
end
elsif lexer.nexttok.raw == '['
lexer.readtok
raise "unexpected EOS" if lexer.eos?
if not base = Reg.s_to_i[lexer.nexttok.raw]
raise lexer, 'invalid mem base (reg expected)'
end
base = Reg.new Reg.s_to_i[lexer.readtok.raw]
raise "unexpected EOS" if lexer.eos?
if lexer.nexttok.raw == ']'
lexer.readtok
#closed = true
closed = true
end
if !lexer.nexttok or lexer.nexttok.raw != ','
if lexer.nexttok.raw != ','
raise lexer, 'mem off expected'
end
lexer.readtok
off = parse_argument(lexer)
if not off.kind_of?(Expression) and not off.kind_of?(Reg)
if not off.kind_of? Expression and not off.kind_of? Reg
raise lexer, 'invalid mem off (reg/imm expected)'
end
case lexer.nexttok and lexer.nexttok.raw
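The :i8_r branch of parse_arg_valid? above accepts a constant only if some even rotation of it fits in 8 bits. A minimal standalone sketch of that test follows. The method name arm_rotated_imm8 and its return shape are assumptions made for illustration, not Metasm API, and the loop is arranged slightly differently from the one in the diff.

```ruby
# Hypothetical helper: returns [imm8, rotate] such that
# value == imm8 rotated right by (2 * rotate) bits, or nil if no such pair exists.
def arm_rotated_imm8(value)
  v = value & 0xffff_ffff
  16.times do |rot|
    # After rot iterations v holds the input rotated left by 2*rot bits.
    return [v, rot] if v < 0x100
    v = ((v << 2) & 0xffff_ffff) | (v >> 30) # rotate left by 2 within 32 bits
  end
  nil
end

p arm_rotated_imm8(0xff)        # fits without rotation
p arm_rotated_imm8(0xff000000)  # 0xff rotated right by 8 bits
p arm_rotated_imm8(0x101)       # set bits span 9 positions: not encodable
```

The rotate value returned here corresponds to the 4-bit :rotate field (shift 8) and imm8 to the :i8 field in the tables of init_arm_v6.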
@@ -4,7 +4,7 @@
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/render'
require 'metasm/cpu/arm/opcodes'
require 'metasm/arm/opcodes'
module Metasm
class ARM
@@ -19,7 +19,7 @@ class ARM
["#{r} RRX"]
else
case s = @shift
when Integer; s = Expression[s == 0 ? 32 : s] # lsl and ror already accounted for
when Integer; s = Expression[s]
when Reg; s = self.class.i_to_s[s.i]
end
["#{r} #{@stype.to_s.upcase} #{s}"]
+17 -43
@@ -11,14 +11,14 @@ module Metasm
module C
class Parser
def precompile
@toplevel.precompile(Compiler.new(self, @program))
@toplevel.precompile(Compiler.new(self))
self
end
end
# each CPU defines a subclass of this one
class Compiler
# an ExeFormat (mostly used for unique label creation, and cpu.check_reserved_name)
# an ExeFormat (mostly used for unique label creation)
attr_accessor :exeformat
# the C Parser (destroyed by compilation)
attr_accessor :parser
@@ -26,8 +26,6 @@ module C
attr_accessor :source
# list of unique labels generated (to recognize user-defined ones)
attr_accessor :auto_label_list
# map asm name -> original C name (for exports etc)
attr_accessor :label_oldname
attr_accessor :curexpr
# allows 'raise self' (eg struct.offsetof)
@@ -36,11 +34,9 @@ module C
end
# creates a new CCompiler from an ExeFormat and a C Parser
def initialize(parser, exeformat=nil, source=[])
exeformat ||= ExeFormat.new
def initialize(parser, exeformat=ExeFormat.new, source=[])
@parser, @exeformat, @source = parser, exeformat, source
@auto_label_list = {}
@label_oldname = {}
end
def new_label(base='')
@@ -159,9 +155,7 @@ module C
c_init_state(func)
# hide the full @source while compiling, then add prolog/epilog (saves 1 pass)
@source << ''
@source << "#{@label_oldname[func.name]}:" if @label_oldname[func.name]
@source << "#{func.name}:"
@source << '' << "#{func.name}:"
presource, @source = @source, []
c_block(func.initializer)
@@ -252,7 +246,6 @@ module C
w = data.type.align(@parser)
@source << ".align #{align = w}" if w > align
@source << "#{@label_oldname[data.name]}:" if @label_oldname[data.name]
@source << data.name.dup
len = c_idata_inner(data.type, data.initializer)
len %= w
@@ -405,7 +398,6 @@ module C
end
def c_udata(data, align)
@source << "#{@label_oldname[data.name]}:" if @label_oldname[data.name]
@source << "#{data.name} "
@source.last <<
case data.type
@@ -426,11 +418,7 @@ module C
len == 0 ? align : len
end
# return non-nil if the variable name is unsuitable to appear as is in the asm listing
# eg filter out asm instruction names
def check_reserved_name(var)
return true if @exeformat.cpu and @exeformat.cpu.check_reserved_name(var.name)
%w[db dw dd dq].include?(var.name)
end
end
@@ -550,36 +538,21 @@ module C
class Declaration
def precompile(compiler, scope)
if (@var.type.kind_of? Function and @var.initializer and scope != compiler.toplevel) or @var.storage == :static or compiler.check_reserved_name(@var)
# TODO fix label name in export table if __exported
scope.symbol.delete @var.name
old = @var.name
ref = scope.symbol.delete old
if scope == compiler.toplevel or (@var.type.kind_of?(Function) and not @var.initializer)
if n = compiler.label_oldname.index(old)
# reuse same name as predeclarations
@var.name = n
else
newname = old
newname = compiler.new_label newname until newname != old
if not compiler.check_reserved_name(@var)
compiler.label_oldname[newname] = old
end
@var.name = newname
end
ref ||= scope.symbol[@var.name] || @var
# append only one actual declaration for all predecls (the one with init, or the last uninit)
scope.statements << self if ref.eql?(@var)
else
@var.name = compiler.new_label @var.name until @var.name != old
compiler.toplevel.statements << self
end
compiler.toplevel.symbol[@var.name] = ref
@var.name = compiler.new_label @var.name until @var.name != old
compiler.toplevel.symbol[@var.name] = @var
# TODO no pure inline if addrof(func) needed
compiler.toplevel.statements << self unless @var.attributes.to_a.include? 'inline'
else
scope.symbol[@var.name] ||= @var
appendme = true if scope.symbol[@var.name].eql?(@var)
appendme = true
end
if i = @var.initializer
if @var.type.kind_of? Function
if @var.type.type.kind_of? Union
if @var.type.type.kind_of? Struct
s = @var.type.type
v = Variable.new
v.name = compiler.new_label('return_struct_ptr')
@@ -595,7 +568,6 @@ module C
Label.new(i.return_label).precompile(compiler, i)
i.precompile_optimize
# append now so that static dependencies are declared before us
# TODO no pure inline if addrof(func) needed
scope.statements << self if appendme and not @var.attributes.to_a.include? 'inline'
elsif scope != compiler.toplevel and @var.storage != :static
scope.statements << self if appendme
@@ -608,6 +580,7 @@ module C
else
scope.statements << self if appendme
end
end
# turns an initializer to CExpressions in scope.statements
@@ -904,7 +877,7 @@ module C
def precompile(compiler, scope)
if @value
@value = CExpression.new(nil, nil, @value, @value.type) if not @value.kind_of? CExpression
if @value.type.untypedef.kind_of? Union
if @value.type.untypedef.kind_of? Struct
@value = @value.precompile_inner(compiler, scope)
func = scope.function.type
CExpression.new(CExpression.new(nil, :*, func.args.first, @value.type), :'=', @value, @value.type).precompile(compiler, scope)
@@ -1038,7 +1011,7 @@ module C
lexpr = CExpression.precompile_inner(compiler, scope, @lexpr)
@lexpr = nil
@op = nil
if struct.kind_of? Union and (off = struct.offsetof(compiler, @rexpr)) != 0
if struct.kind_of? Struct and (off = struct.offsetof(compiler, @rexpr)) != 0
off = CExpression.new(nil, nil, off, BaseType.new(:int, :unsigned))
@rexpr = CExpression.new(lexpr, :'+', off, lexpr.type)
# ensure the (ptr + value) is not expanded to (ptr + value * sizeof(*ptr))
@@ -1184,7 +1157,7 @@ module C
}
scope.statements << copy_inline[@lexpr.initializer, scope] # body already precompiled
CExpression.new(nil, nil, rval, rval.type).precompile_inner(compiler, scope)
elsif @type.kind_of? Union
elsif @type.kind_of? Struct
var = Variable.new
var.name = compiler.new_label('return_struct')
var.type = @type
@@ -1461,3 +1434,4 @@ module C
end
end
end
-8
@@ -1,8 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2010 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
require 'metasm/cpu/arc/decode'
-425
@@ -1,425 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2010 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/arc/opcodes'
require 'metasm/decode'
module Metasm
class ARC
def major_opcode(val, sz = 16)
return val >> (sz == 16 ? 0xB : 0x1B)
end
def sub_opcode(val)
return ((val >> 16) & 0x3f)
end
def build_opcode_bin_mask(op, sz)
op.bin_mask = 0
op.args.each { |f| op.bin_mask |= @fields_mask[f] << @fields_shift[f]}
op.bin_mask = ((1 << sz)-1) ^ op.bin_mask
end
def build_bin_lookaside
bin_lookaside = {}
opcode_list.each{|mode,oplist|
lookaside = {}
# 2nd level to speed up lookaside for major 5
lookaside[5] = {}
oplist.each { |op|
next if not op.bin.kind_of? Integer
build_opcode_bin_mask(op, mode)
mj = major_opcode(op.bin, mode)
if mode == 32 and mj == 5
(lookaside[mj][sub_opcode(op.bin)] ||= []) << op
else
(lookaside[mj] ||= []) << op
end
}
bin_lookaside[mode] = lookaside
}
bin_lookaside
end
def instruction_size(edata)
val = major_opcode(edata.decode_imm(:u16, @endianness))
edata.ptr -= 2
(val >= 0xC) ? 16 : 32
end
def memref_size(di)
case di.opcode.name
when 'ldb_s', 'stb_s', 'extb_s', 'sexb_s'; 1
when 'ldw_s', 'stw_s', 'extw_s', 'sexw_s'; 2
else 4
end
end
def decode_bin(edata, sz)
case sz
when 16; edata.decode_imm(:u16, @endianness)
when 32
# wordswap
val = edata.decode_imm(:u32, :little)
((val >> 16) & 0xffff) | ((val & 0xffff) << 16)
end
end
def decode_findopcode(edata)
di = DecodedInstruction.new(self)
@instrlength = instruction_size(edata)
val = decode_bin(edata, @instrlength)
edata.ptr -= @instrlength/8
maj = major_opcode(val, @instrlength)
lookaside = @bin_lookaside[@instrlength][maj]
lookaside = lookaside[sub_opcode(val)] if @instrlength == 32 and maj == 5
op = lookaside.select { |opcode|
if $ARC_DEBUG and (val & opcode.bin_mask) == opcode.bin
puts "#{opcode.bin_mask.to_s(16)} - #{opcode.bin.to_s(16)} - #{(val & opcode.bin_mask).to_s(16)} - #{opcode.name} - #{opcode.args}"
end
(val & opcode.bin_mask) == opcode.bin
}
if op.size == 2 and op.first.name == 'mov' and op.last.name == 'nop'
op = op.last
elsif op == nil or op.size != 1
puts "[> I sense a disturbance in the force <]"
op.to_a.each { |opcode| puts "#{opcode.name} - #{opcode.args} - #{Expression[opcode.bin]} - #{Expression[opcode.bin_mask]}" }
puts "current value: #{Expression[val]}"
puts "current value: 0b#{val.to_s(2)}"
op = nil
else
op = op.first
end
di if di.opcode = op
end
Reduced_reg = [0, 1, 2, 3, 12, 13, 14, 15]
def reduced_reg_set(i)
Reduced_reg[i]
end
def decode_instr_op(edata, di)
before_ptr = edata.ptr
op = di.opcode
di.instruction.opname = op.name
val = decode_bin(edata, @instrlength)
field_val = lambda { |f|
r = (val >> @fields_shift[f]) & @fields_mask[f]
case f
# 16-bits instruction operands ------------------------------------------"
when :ca, :cb, :cb2, :cb3, :cc; r = reduced_reg_set(r)
when :ch
r = (((r & 7) << 3) | (r >> 5))
when :@cbu7, :@cbu6, :@cbu5
r = r & 0b11111
r = (f == :@cbu7) ? r << 2 : ( (f == :@cbu6) ? r << 1 : r)
when :cu5ee; r = r << 2
when :cdisps13
r = (Expression.make_signed(r,11) << 2) + ((di.address >> 2) << 2)
when :cdisps10
r = (Expression.make_signed(r, 9) << 1) + ((di.address >> 2) << 2)
when :cdisps8
r = (Expression.make_signed(r, 7) << 1) + ((di.address >> 2) << 2)
when :cdisps7
r = (Expression.make_signed(r, 6) << 1) + ((di.address >> 2) << 2)
when :cs9, :cs10, :cs11
r = Expression.make_signed(r, ((f== :cs11 ? 11 : (f == :cs10 ? 10 : 9) )))
r = (f == :cs11) ? r << 2 : ((f == :cs10) ? r << 1 : r)
when :@cspu7;
r = r << 2
# 32-bits instruction operands ------------------------------------------"
when :b
r = (r >> 12) | ((r & 0x7) << 3)
when :s8e
r = ((r & 0x1) << 7) | (r >> 2)
r = (Expression.make_signed(r, 8) << 1) + ((di.address >> 2) << 2)
when :u6e
r = (r << 1) + ((di.address >> 2) << 2)
when :s9
r = (Expression.make_signed(r, 7) << 1) + ((di.address >> 2) << 2)
when :s12
r = (r >> 6) | ((r & 0x3f) << 6)
r = Expression.make_signed(r, 12)
when :s12e
r = (r >> 6) | ((r & 0x3f) << 6)
r = (Expression.make_signed(r, 12) <<1 ) + ((di.address >> 2) << 2)
when :s21e
r = ((r & 0x3ff) << 10) | (r >> 11)
r = (Expression.make_signed(r, 20) << 1) + ((di.address >> 2) << 2)
when :s21ee # pc-relative
r = ((r & 0x3ff) << 9) | (r >> 12)
r = (Expression.make_signed(r, 19) << 2) + ((di.address >> 2) << 2)
when :s25e # pc-relative
r = ((r & 0xf) << 20) | (((r >> 6) & 0x3ff) << 10) | (r >> 17)
r = (Expression.make_signed(r, 24) << 1) + ((di.address >> 2) << 2)
when :s25ee # pc-relative
r = ((r & 0xf) << 19) | (((r >> 6) & 0x3ff) << 9) | (r >> 18)
r = (Expression.make_signed(r, 23) << 2) + ((di.address >> 2) << 2)
when :@bs9
r = r >> 3
s9 = ((r & 1) << 8) | ((r >> 1) & 0xff)
r = Expression.make_signed(s9, 9)
when :bext, :cext, :@cext
if ((r = field_val[(f == :bext) ? :b : :c]) == 0x3E)
tmp = edata.decode_imm(:u32, :little)
r = Expression[(tmp >> 16) | ((tmp & 0xffff) << 16)]
else
r = GPR.new(r)
end
else r
end
r
}
# decode properties fields
op.args.each { |a|
case a
when :flags15, :flags16
di.instruction.opname += '.f' if field_val[a] != 0
when :ccond
di.instruction.opname += ('.' + @cond_suffix[field_val[a]]) if field_val[a] != 0
when :delay5, :delay16
di.instruction.opname += '.d' if field_val[a] != 0
when :cache5, :cache11, :cache16
di.instruction.opname +='.di' if field_val[a] != 0
when :signext6, :signext16
di.instruction.opname += '.x' if field_val[a] != 0
when :wb3, :wb9, :wb22
case field_val[a]
when 1; di.instruction.opname += ((memref_size(di) == 2) ? '.ab' : '.a')
when 2; di.instruction.opname += '.ab'
when 3; di.instruction.opname += '.as'
end
when :sz1, :sz7, :sz16, :sz17
case field_val[a]
when 1; di.instruction.opname += 'b'
when 2; di.instruction.opname += 'w'
end
else
di.instruction.args << case a
# 16-bits instruction operands ------------------------------------------"
when :cr0; GPR.new 0
when :ca, :cb, :cb2, :cb3, :cc; GPR.new(field_val[a])
when :ch
if ((r = field_val[a]) == 0x3E)
tmp = edata.decode_imm(:u32, :little)
Expression[(tmp >> 16) | ((tmp & 0xffff) << 16)]
else
GPR.new(r)
end
when :@gps9, :@gps10, :@gps11
imm = (a == :@gps11) ? :cs11 : (a == :@gps10) ? :cs10 : :cs9
Memref.new(GPR.new(26), Expression[field_val[imm]], memref_size(di))
when :cu3, :cu5, :cu5ee, :cu6, :cu7, :cu7l, :cu8; Expression[field_val[a]]
when :cs9, :cs10, :cs11; Expression[field_val[a]]
when :cdisps7, :cdisps8, :cdisps10, :cdisps13; Expression[field_val[a]]
when :@cb; Memref.new(GPR.new(field_val[:cb]), nil, memref_size(di))
when :@cbu7, :@cbu6, :@cbu5; Memref.new(GPR.new(field_val[:cb]), Expression[field_val[a]], memref_size(di))
when :@cspu7; Memref.new(GPR.new(28), field_val[a], memref_size(di))
when :@cbcc; Memref.new(field_val[:cb], field_val[:cc], memref_size(di))
# 32-bits instruction operands ------------------------------------------"
when :a, :b
((r = field_val[a]) == 0x3E) ? :zero : GPR.new(r)
when :b2; GPR.new field_val[:b]
when :c; GPR.new field_val[a]
when :bext, :cext; field_val[a]
when :@cext
target = field_val[a]
(di.opcode.props[:setip] and target.kind_of? GPR) ? Memref.new(target, nil, memref_size(di)) : target
when :@bextcext
tmp = field_val[a]
#c = tmp & 0x3F
tmp = tmp >> 6
b = (tmp >> 12) | ((tmp & 0x7) << 3)
Memref.new(field_val[:bext], field_val[:cext], memref_size(di))
when :u6, :u6e, :s8e, :s9, :s12; Expression[field_val[a]]
when :s12e, :s21e, :s21ee, :s25e, :s25ee; Expression[field_val[a]]
when :auxs12; AUX.new field_val[:s12]
when :@c; Memref.new(GPR.new(field_val[a]), nil, memref_size(di))
when :@bcext; Memref.new(field_val[a], nil, memref_size(di))
when :@bcext; Memref.new(field_val[:b], field_val[:cext], memref_size(di))
when :@bs9
# [b,s9] or [limm] if b = 0x3E
base = field_val[:bext]
Memref.new(base, (base.kind_of? GPR) ? Expression[field_val[a]] : nil, memref_size(di))
# common instruction operands ------------------------------------------"
when :zero; Expression[0]
when :gp; GPR.new(26)
when :sp, :sp2; GPR.new(28)
when :blink; GPR.new(31)
when :@ilink1; Memref.new(GPR.new(29), nil, memref_size(di))
when :@ilink2; Memref.new(GPR.new(30), nil, memref_size(di))
when :@blink; Memref.new(GPR.new(31), nil, memref_size(di))
else raise SyntaxError, "Internal error: invalid argument #{a} in #{op.name}"
end
end
}
di.bin_length += edata.ptr - before_ptr
return if edata.ptr > edata.virtsize
di
end
def disassembler_default_func
df = DecodedFunction.new
df.backtrace_binding = {}
15.times { |i|
df.backtrace_binding["r#{i}".to_sym] = Expression::Unknown
}
df.backtracked_for = []
df.btfor_callback = lambda { |dasm, btfor, funcaddr, calladdr|
if funcaddr != :default
btfor
elsif di = dasm.decoded[calladdr] and di.opcode.props[:saveip]
btfor
else []
end
}
df
end
REG_SYMS = [:r26, :r27, :r28, :r29, :r30, :r31, :r60]
def register_symbols
REG_SYMS
end
def backtrace_binding
@backtrace_binding ||= init_backtrace_binding
end
def opshift(op)
op[/\d/].to_i
end
def with_res(arg)
arg != :zero
end
def init_backtrace_binding
sp = :r28
blink = :r31
@backtrace_binding ||= {}
mask = lambda { |sz| (1 << sz)-1 } # 32bits => 0xffff_ffff
opcode_list.each{|mode, oplist|
oplist.map { |ol| ol.name }.uniq.each { |op|
binding = case op
when /^add/, /^sub/
lambda { |di, a0, a1, a2|
if (shift = opshift(op)) == 0
{ a0 => Expression[[a1, :+, a2], :&, mask[32]] }
else
{ a0 => Expression[[a1, :+, [a2, :<<, shift]], :&, mask[32]] }
end
}
when /^and/
lambda { |di, a0, a1, a2| { a0 => Expression[a1, :&, a2] } }
when /^asl/
lambda { |di, *a| { a[0] => Expression[[a[1], :<<, (a[2] ? a[2]:1)], :&, mask[32]] } }
when /^bxor/
lambda { |di, a0, a1, a2| { a0 => Expression[a1, :^, [1, :<<, a2]] }}
when /^bclr/; lambda { |di, a0, a1, a2| { a0 => Expression[a1, :&, Expression[mask[32], :^, Expression[1, :<<, a2]]] } }
when /^bset/; lambda { |di, a0, a1, a2| { a0 => Expression[a1, :|, Expression[1, :<<, a2]] } }
when /^jl/; lambda { |di, a0| { blink => Expression[di.next_addr] } }
when 'bl', 'bl_s', /^bl\./
# FIXME handle delay slot
# "This address is taken either from the first instruction following the branch (current PC) or the
# instruction after that (next PC) according to the delay slot mode (.d)."
lambda { |di, a0| { blink => Expression[di.next_addr] } }
when /^mov/, /^lr/, /^ld/; lambda { |di, a0, a1| { a0 => a1 } }
when /^neg/; lambda { |di, a0, a1| { a0 => Expression[[0, :-, a1], :&, mask[32]] } }
when /^not/; lambda { |di, a0, a1| { a0 => Expression[[:~, a1], :&, mask[32]] } }
when /^or/; lambda { |di, a0, a1, a2| { a0 => Expression[a1, :|, a2] } }
when /^st/, /^sr/; lambda { |di, a0, a1| { a1 => a0 } }
when /^ex/; lambda { |di, a0, a1| { a1 => a0 , a0 => a1 } }
when 'push_s'
lambda { |di, a0| {
sp => Expression[sp, :-, 4],
Indirection[sp, @size/8, di.address] => Expression[a0]
} }
when 'pop_s'
lambda { |di, a0| {
a0 => Indirection[sp, @size/8, di.address],
sp => Expression[sp, :+, 4]
} }
end
@backtrace_binding[op] ||= binding if binding
}
}
@backtrace_binding
end
def get_backtrace_binding(di)
a = di.instruction.args.map { |arg|
case arg
when GPR; arg.symbolic
when Memref; arg.symbolic(di.address)
else arg
end
}
if binding = backtrace_binding[di.opcode.basename]
binding[di, *a]
else
puts "unhandled instruction to backtrace: #{di}" if $VERBOSE
{ :incomplete_binding => Expression[1] }
end
end
def get_xrefs_x(dasm, di)
return [] if not di.opcode.props[:setip]
arg = case di.opcode.name
when 'b', 'b_s', /^j/, /^bl/, /^br/, 'lp'
expr = di.instruction.args.last
expr.kind_of?(Memref) ? expr.base : expr
else di.instruction.args.last
end
[Expression[(arg.kind_of?(Reg) ? arg.symbolic : arg)]]
end
def backtrace_is_function_return(expr, di=nil)
Expression[expr].reduce == Expression[register_symbols[5]]
end
def delay_slot(di=nil)
return 0 if (not di) or (not di.opcode.props[:setip])
return 1 if di.opcode.props[:delay_slot]
(di.instruction.opname =~ /\.d/) ? 0 : 1
end
end
end
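The removed ARC decoder picks the instruction length from the major opcode in the first 16-bit halfword and, for 32-bit instructions, swaps the two halfwords after a little-endian read (see instruction_size and decode_bin above). The sketch below reproduces those two steps on raw byte strings using only String#unpack. The method names are hypothetical, and it assumes a little-endian target, whereas the original reads the first halfword with @endianness.

```ruby
# Hypothetical helper: instruction length in bits, from the first halfword.
# Major opcodes 0x0c and above are the compact 16-bit forms.
def arc_instruction_bits(bytes)
  first_half = bytes.unpack('v').first # little-endian u16
  (first_half >> 11) >= 0xc ? 16 : 32
end

# Hypothetical helper: read a 32-bit instruction word, swapping halfwords
# the same way decode_bin does for sz == 32.
def arc_decode_word32(bytes)
  val = bytes.unpack('V').first # little-endian u32
  ((val >> 16) & 0xffff) | ((val & 0xffff) << 16)
end

raw = "\x34\x12\x78\x56".b
puts arc_instruction_bits(raw)                # major opcode 2, so a 32-bit form
puts format('%08x', arc_decode_word32(raw))
puts arc_instruction_bits("\xe0\x78".b)       # major opcode 0xf, so a 16-bit form
```

The halfword swap means the halfword stored first in memory ends up in the upper 16 bits, which is why major_opcode can shift the 32-bit value right by 0x1B.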
-191
@@ -1,191 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2010 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
module Metasm
class ARC < CPU
def initialize(e = :little)
super()
@endianness = e
@size = 32
end
class Reg
include Renderable
attr_accessor :i
def initialize(i); @i = i end
def ==(o)
o.class == self.class and o.i == i
end
end
# general purpose reg
# Result R0-R1
# Arguments R0-R7
# Caller Saved Registers R0-R12
# Callee Saved Registers R13-R25
# Static chain pointer (if required) R11
# Register for temp calculation R12
# Global Pointer R26 (GP)
# Frame Pointer R27 (FP)
# Stack Pointer R28 (SP)
# Interrupt Link Register 1 R29 (ILINK1)
# Interrupt Link Register 2 R30 (ILINK2)
# Branch Link Register R31 (BLINK)
class GPR < Reg
Sym = (0..64).map { |i| "r#{i}".to_sym }
def symbolic; Sym[@i] end
Render = {
26 => 'gp', # global pointer, used to point to small sets of shared data throughout execution of a program
27 => 'fp', # frame pointer
28 => 'sp', # stack pointer
29 => 'ilink1', # maskable interrupt link register
30 => 'ilink2', # maskable interrupt link register 2
31 => 'blink', # branch link register
60 => 'lp_count', # loop count register (24 bits)
# "When a destination register is set to r62 there is no destination for the result of the instruction so the
# result is discarded. Any flag updates will still occur according to the set flags directive (.F or implicit
# in the instruction)."
62 => 'zero'
}
def render
if s = Render[i]
[s]
else
# r0-r28 general purpose registers
# r32-r59 reserved for extensions
["r#@i"]
end
end
end
class AUX < Reg
def symbolic; "aux#{i}".to_sym end
Render = {
0x00 => 'status', # Status register (Original ARCtangent-A4 processor format)
0x01 => 'semaphore', # Inter-process/Host semaphore register
0x02 => 'lp_start', # Loop start address (32-bit)
0x03 => 'lp_end', # Loop end address (32-bit)
0x04 => 'identity', # Processor Identification register
0x05 => 'debug', # Debug register
0x06 => 'pc', # PC register (32-bit)
0x0A => 'status32', # Status register (32-bit)
0x0B => 'status32_l1', # Status register save for level 1 interrupts
0x0C => 'status32_l2', # Status register save for level 2 interrupts
0x10 => 'ic_ivic', # Cache invalidate
0x11 => 'ic_ctrl', # Mode bits for cache controller
0x12 => 'mulhi', # High part of Multiply
0x19 => 'ic_ivil',
0x21 => 'timer0_cnt', # Processor Timer 0 Count value
0x22 => 'timer0_ctrl', # Processor Timer 0 Control value
0x23 => 'timer0_limit', # Processor Timer 0 Limit value
0x25 => 'int_vector_base', # Interrupt Vector Base address
0x40 => 'im_set_dc_ctrl',
0x41 => 'aux_macmode', # Extended Arithmetic Status and Mode
0x43 => 'aux_irq_lv12', # Interrupt Level Status
0x47 => 'dc_ivdc', # Invalidate cache
0x48 => 'dc_ctrl', # Cache control register
0x49 => 'dc_ldl', # Lock data line
0x4A => 'dc_ivdl', # Invalidate data line
0x4B => 'dc_flsh', # Flush data cache
0x4C => 'dc_fldl', # Flush data line
0x58 => 'dc_ram_addr', # Access RAM address
0x59 => 'dc_tag', # Tag Access
0x5A => 'dc_wp', # Way Pointer Access
0x5B => 'dc_data', # Data Access
0x62 => 'crc_bcr',
0x64 => 'dvfb_bcr',
0x65 => 'extarith_bcr',
0x68 => 'vecbase_bcr',
0x69 => 'perbase_bcr',
0x6f => 'mmu_bcr',
0x72 => 'd_cache_build', # Build: Data Cache
0x73 => 'madi_build', # Build: Multiple ARC Debug I/F
0x74 => 'ldstram_build', # Build: LD/ST RAM
0x75 => 'timer_build', # Build: Timer
0x76 => 'ap_build', # Build: Actionpoints
0x77 => 'i_cache_build', # Build: I-Cache
0x78 => 'addsub_build', # Build: Saturated Add/Sub
0x79 => 'dspram_build', # Build: Scratch RAM & XY Memory
0x7B => 'multiply_build', # Build: Multiply
0x7C => 'swap_build', # Build: Swap
0x7D => 'norm_build', # Build: Normalise
0x7E => 'minmax_build', # Build: Min/Max
0x7F => 'barrel_build', # Build: Barrel Shift
0x100 => 'timer1_cnt', # Processor Timer 1 Count value
0x101 => 'timer1_ctrl', # Processor Timer 1 Control value
0x102 => 'timer1_limit', # Processor Timer 1 Limit value
0x200 => 'aux_irq_lev', # Interrupt Level Programming
0x201 => 'aux_irq_hint', # Software Triggered Interrupt
0x202 => 'aux_irq_mask', # Masked bits for Interrupts
0x203 => 'aux_irq_base', # Interrupt Vector base address
0x400 => 'eret', # Exception Return Address
0x401 => 'erbta', # Exception Return Branch Target Address
0x402 => 'erstatus', # Exception Return Status
0x403 => 'ecr', # Exception Cause Register
0x404 => 'efa', # Exception Fault Address
0x40A => 'icause1', # Level 1 Interrupt Cause Register
0x40B => 'icause2', # Level 2 Interrupt Cause Register
0x40C => 'aux_ienable', # Interrupt Mask Programming
0x40D => 'aux_itrigger', # Interrupt Sensitivity Programming
0x410 => 'xpu', # User Mode Extension Enables
0x412 => 'bta', # Branch Target Address
0x413 => 'bta_l1', # Level 1 Return Branch Target
0x414 => 'bta_l2', # Level 2 Return Branch Target
0x415 => 'aux_irq_pulse_cancel', # Interrupt Pulse Cancel
0x416 => 'aux_irq_pending', # Interrupt Pending Register
}
def render
if s = Render[i]
[s]
else
["aux#@i"]
end
end
end
class Memref
attr_accessor :base, :disp
def initialize(base, disp, sz)
@base, @disp, @size = base, disp, sz
end
def symbolic(orig)
b = @base
b = b.symbolic if b.kind_of? Reg
if disp
o = @disp
o = o.symbolic if o.kind_of? Reg
e = Expression[b, :+, o].reduce
else
e = Expression[b].reduce
end
Indirection[e, @size, orig]
end
include Renderable
def render
if @disp and @disp != 0
['[', @base, ', ', @disp, ']']
else
['[', @base, ']']
end
end
end
end
end
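GPR#render above falls back to a plain "rN" name when a register has no alias in the Render table. A small standalone sketch of that lookup is shown here. GPR_NAMES and gpr_name are hypothetical names; the table entries are copied from the Render hash in the diff.

```ruby
# Aliases copied from GPR::Render in the diff.
GPR_NAMES = {
  26 => 'gp', 27 => 'fp', 28 => 'sp',
  29 => 'ilink1', 30 => 'ilink2', 31 => 'blink',
  60 => 'lp_count', 62 => 'zero'
}

# Hypothetical helper: alias if one exists, otherwise the generic rN name.
def gpr_name(i)
  GPR_NAMES.fetch(i) { "r#{i}" }
end

puts [3, 28, 31, 62].map { |i| gpr_name(i) }.join(' ')
```

Hash#fetch with a block keeps the fallback next to the lookup, which mirrors the if/else in the original render method.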
-588
@@ -1,588 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2010 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/arc/main'
module Metasm
class ARC
def addop32(name, bin, *args)
addop(:ac32, name, bin, *args)
end
def addop16(name, bin, *args)
addop(:ac16, name, bin, *args)
end
def addop(mode, name, bin, *args)
o = Opcode.new(name)
o.bin = bin
args.each { |a|
o.args << a if @fields_mask[a]
o.props[a] = true if @valid_props[a]
o.fields[a] = [@fields_mask[a], @fields_shift[a]] if @fields_mask[a]
}
(mode == :ac16) ? (@opcode_list16 << o) : (@opcode_list32 << o)
end
def init_opcode_list
@opcode_list16 = []
@opcode_list32 = []
@valid_props.update :flag_update => true, :delay_slot => true
@cond_suffix = [''] + %w[z nz p n cs cc vs vc gt ge lt le hi ls pnz]
#The remaining 16 condition codes (10-1F) are available for extension
@cond_suffix += (0x10..0x1f).map{ |i| "extcc#{i.to_s(16)}" }
# Compact 16-bit operand field masks
fields_mask16 = {
:ca => 0x7, :cb => 0x7, :cb2 => 0x7, :cb3 => 0x7, :cc => 0x7,
:cu => 0x1f,
:ch => 0b11100111,
# immediate (un)signed
:cu3 => 0x7, :cu8 => 0xff,
# cu7 is 32-bit aligned, cu6 is 16-bit aligned, cu5 is 8-bit aligned
:cu5 => 0x1f, :cu5ee => 0x1f, :cu6 => 0x3f, :cu7 => 0x7f,
:cs9 => 0x1ff, :cs9ee => 0x1ff, :cs10 => 0x1ff, :cs11 => 0x1ff,
# signed displacement
:cdisps7=> 0x3f, :cdisps8 => 0x7f, :cdisps10 => 0x1ff, :cdisps13 => 0x7FF,
# memref [b+u], [sp,u], etc.
:@cb => 0x7, :@cbu7 => 0b11100011111, :@cbu6 => 0b11100011111, :@cbu5 => 0b11100011111,
:@cspu7 => 0b11111, :@cbcc => 0b111111,
:@gps9 => 0x1ff, :@gps10 => 0x1ff, :@gps11 => 0x1ff,
# implicit operands
:climm => 0x0, :cr0 => 0x0,
:blink => 0x0, :@blink => 0x0, :gp => 0x0, :sp => 0x0, :sp2 => 0x0, :zero => 0x0
}
fields_shift16 = {
:ca => 0x0, :cb => 0x8, :cb2 => 0x8, :cb3 => 0x8, :cc => 0x5,
:cu => 0x0,
# immediate (un)signed
:ch => 0x0,
:cu3 => 0x0, :cu5 => 0, :cu5ee => 0, :cu6 => 5, :cu7 => 0x0, :cu8 => 0x0,
:cs9 => 0x0, :cs9ee => 0x0, :cs10 => 0x0, :cs11 => 0x0,
# signed displacement
:cdisps7=> 0x0, :cdisps8 => 0x0, :cdisps10 => 0x0, :cdisps13 => 0x0,
# memref [b+u]
:@cb => 0x8, :@cbu7 => 0x0, :@cbu6 => 0x0, :@cbu5 => 0x0,
:@cspu7 => 0x0, :@cbcc => 0x5,
:@gps9 => 0x0, :@gps10 => 0x0, :@gps11 => 0x0,
# implicit operands
:climm => 0x0, :cr0 => 0x0,
:blink => 0x0, :@blink => 0x0, :gp => 0x0, :sp => 0x0, :sp2 => 0x0, :zero => 0x0,
}
fields_mask32 = {
:a => 0x3f, :b => 0b111000000000111, :bext => 0b111000000000111,
:c => 0x3f, :@c => 0x3f, :cext => 0x3f, :@cext => 0x3f,
:u6 => 0x3f, :u6e => 0x3f,
:s8e => 0x1fd, :s9 => 0x7f,
:s12 => 0xfff, :s12e => 0xfff,
:s21e => 0x1ffBff, :s21ee => 0x1ff3ff,
:s25e => 0x7feffcf, :s25ee => 0x7fcffcf,
:@bs9 => 0x7fff, :@bc => 0x1ff, :@bextcext => 0x1C01FF,
:limm => 0x0, :@limm => 0x0,
:@limmc => 0x3f, :@blimm => 0x7,
:auxlimm => 0x0, :auxs12 => 0xfff,
:ccond => 0x1f, #condition codes
:delay5 => 1, :delay16 => 1,# delay slot
:flags15 => 0x1, :flags16 => 0x1,
:signext6 => 0x1, :signext16 => 0x1,
:cache5 => 0x1, :cache11 => 0x1, :cache16 => 0x1, # data cache mode field
:sz1 => 0x3, :sz7 => 0x3, :sz16 => 0x3, :sz17 => 0x3, #data size field
:wb3 => 0x3, :wb9 => 0x3, :wb22 => 0x3, #write-back flag
:zero => 0x0, :b2 => 0x0, :@ilink1 => 0x0, :@ilink2 => 0x0
}
#FIXME
fields_shift32 = {
:a => 0x0, :b => 0xC, :bext => 0xC,
:c => 0x6, :@c => 0x6, :cext => 0x6, :@cext => 0x6,
:u6 => 0x6, :u6e =>0x6,
:s8e => 15, :s9 => 0x11,
:s12 => 0x0, :s12e => 0,
:s21e => 0x6, :s21ee => 0x6,
:s25e => 0, :s25ee => 0,
:limm => 0x0, :@limm => 0x0,
:@limmc => 0x6, :@blimm => 0x18,
:auxlimm => 0x0, :auxs12 => 0,
:@bs9 => 12, :@bc => 6, :@bextcext => 6,
:ccond => 0, #condition codes
:delay5 => 5, :delay16 => 16,# delay slot
:flags15 => 15, :flags16 => 16,
:signext6 => 6, :signext16 => 16,
:cache5 => 5, :cache11 => 11, :cache16 => 16, # data cache mode field
:sz1 => 1, :sz7 => 7, :sz16 => 16, :sz17 => 17, #data size field
:wb3 => 3, :wb9 => 9, :wb22 => 22, #write-back flag
:zero => 0x0, :b2 => 0x0, :@ilink1 => 0, :@ilink2 => 0,
}
@fields_mask = fields_mask16.merge(fields_mask32)
@fields_shift = fields_shift16.merge(fields_shift32)
init_arc_compact16()
init_arc_compact32()
{16 => @opcode_list16, 32 => @opcode_list32}
end
def add_artihm_op(op, majorcode, subcode, *flags)
# 0bxxxxxbbb00xxxxxxFBBBCCCCCCAAAAAA
addop32 op, 0b00000000000000000000000000000000 | majorcode << 0x1b | subcode << 16, :a, :bext, :cext, :flags15
# 0bxxxxxbbb01xxxxxxFBBBuuuuuuAAAAAA
addop32 op, 0b00000000010000000000000000000000 | majorcode << 0x1b | subcode << 16, :a, :b, :u6, :flags15
# 0bxxxxxbbb10xxxxxxFBBBssssssSSSSSS
addop32 op, 0b00000000100000000000000000000000 | majorcode << 0x1b | subcode << 16, :b, :b2, :s12, :flags15
# 0bxxxxxbbb11xxxxxxFBBBCCCCCC0QQQQQ
addop32 op, 0b00000000110000000000000000000000 | majorcode << 0x1b | subcode << 16, :b, :b2, :cext, :ccond, :flags15
# 0bxxxxxbbb11xxxxxxFBBBuuuuuu1QQQQQ
addop32 op, 0b00000000110000000000000000100000 | majorcode << 0x1b | subcode << 16, :b, :b2, :u6, :ccond, :flags15
end
def add_logical_op(op, majorcode, subcode, *flags)
# 0b00100bbb00xxxxxxFBBBCCCCCCAAAAAA
addop32 op, 0b00100000000000000000000000000000 | majorcode << 0x1b | subcode << 16, :a, :bext, :c, :flags15
# 0b00100bbb01xxxxxxFBBBuuuuuuAAAAAA
addop32 op, 0b00100000010000000000000000000000 | majorcode << 0x1b | subcode << 16, :a, :b, :u6, :flags15
# 0b00100bbb11xxxxxxFBBBCCCCCC0QQQQQ
# WTF
addop32 op, 0b00100000110000000000000000000000 | majorcode << 0x1b | subcode << 16, :b, :b2, :c, :ccond, :flags15
# 0b00100bbb11xxxxxxFBBBuuuuuu1QQQQQ
addop32 op, 0b00100000110000000000000000100000 | majorcode << 0x1b | subcode << 16, :b, :b2, :u6, :ccond, :flags15
end
def add_artihm_op_reduce(op, majorcode, subcode)
# 0bxxxxxbbb00101111FBBBCCCCCCxxxxxx
addop32 op, 0b00000000001011110000000000000000 | majorcode << 0x1b | subcode, :b, :cext, :flags15
# 0bxxxxxbbb01101111FBBBuuuuuuxxxxxx
addop32 op, 0b00000000011011110000000000000000 | majorcode << 0x1b | subcode, :b, :u6, :flags15
end
def add_condbranch_op(op, ccond)
# 0b00001bbbsssssss1SBBBUUUUUUN0xxxx
addop32 op, 0b00001000000000010000000000000000 | ccond, :bext, :cext, :s8e, :setip, :delay5
# 0b00001bbbsssssss1SBBBUUUUUUN1xxxx
addop32 op, 0b00001000000000010000000000010000 | ccond, :b, :u6, :s8e, :setip, :delay5
end
def add_condjmp_op()
# 0b00100RRR1110000D0RRRCCCCCC0QQQQQ
addop32 'j', 0b00100000111000000000000000000000, :@cext, :ccond, :setip, :delay16
# 0b00100RRR1110000D0RRRuuuuuu1QQQQQ
addop32 'j', 0b00100000111000000000000000100000, :u6, :ccond, :setip, :delay16
# 0b00100RRR111000001RRR0111010QQQQQ
addop32 'j', 0b00100000111000001000011101000000, :@ilink1, :ccond, :setip, :flag_update
# 0b00100RRR111000001RRR0111100QQQQQ
addop32 'j', 0b00100000111000001000011110000000, :@ilink2, :ccond, :setip, :flag_update
end
def add_condjmplink_op()
# 0b00100RRR111000100RRRCCCCCC0QQQQQ
addop32 'jl', 0b00100000111000100000000000000000, :@cext, :ccond, :setip, :saveip, :delay16
# 0b00100RRR111000100RRRuuuuuu1QQQQQ
addop32 'jl', 0b00100000111000100000000000100000, :u6, :ccond, :setip, :saveip, :delay16
end
def init_arc_compact32
add_artihm_op_reduce 'abs', 0b00100, 0b001001
add_artihm_op_reduce 'abss', 0b00101, 0b000101
add_artihm_op_reduce 'abssw', 0b00101, 0b000100
add_artihm_op 'adc', 0b00100, 0b000001
add_artihm_op 'add', 0b00100, 0b000000
add_artihm_op 'add1', 0b00100, 0b010100
add_artihm_op 'add2', 0b00100, 0b010101
add_artihm_op 'add3', 0b00100, 0b010110
add_artihm_op 'adds', 0b00101, 0b000110
add_artihm_op 'addsw', 0b00101, 0b010101, :extended
add_artihm_op 'addsdw',0b00101, 0b101000, :extended
add_artihm_op 'and' ,0b00100, 0b000100
add_artihm_op_reduce 'asl', 0b00100, 0b000000
add_artihm_op 'asl', 0b00101, 0b000000, :extended
add_artihm_op 'asls', 0b00101, 0b001010, :extended
add_artihm_op_reduce 'asr', 0b00100, 0b000001
add_artihm_op 'asr', 0b00101, 0b000010
add_artihm_op 'asrs', 0b00101, 0b001011
# 0b00001bbbsssssss1SBBBCCCCCCN01110
addop32 'bbit0', 0b00001000000000010000000000001110, :b, :c, :s9, :delay5, :setip
# 0b00001bbbsssssss1SBBBuuuuuuN11110
addop32 'bbit0', 0b00001000000000010000000000011110, :b, :u6, :s9, :delay5, :setip
# 0b00001bbbsssssss1SBBBCCCCCCN01111
addop32 'bbit1', 0b00001000000000010000000000001111, :b, :c, :s9, :delay5, :setip
# 0b00001bbbsssssss1SBBBuuuuuuN11111
addop32 'bbit1', 0b00001000000000010000000000011111, :b, :u6, :s9, :delay5, :setip
# 0b00000ssssssssss0SSSSSSSSSSNQQQQQ
addop32 'b', 0b00000000000000000000000000000000, :s21e, :ccond, :delay5, :setip
# 0b00000ssssssssss1SSSSSSSSSSNRtttt
addop32 'b', 0b00000000000000010000000000000000, :s25e, :delay5, :setip, :stopexec
# WTF: unknown encoding, bit 5 should be reserved
addop32 'b', 0b00000000000000010000000000010000, :s25e, :delay5, :setip, :stopexec
add_logical_op 'bclr', 0b00100, 0b010000
add_artihm_op 'bic', 0b00100, 0b000110
# 0b00001sssssssss00SSSSSSSSSSNQQQQQ
addop32 'bl', 0b00001000000000000000000000000000, :s21ee, :ccond, :delay5, :setip, :saveip
# 0b00001sssssssss10SSSSSSSSSSNRtttt
addop32 'bl', 0b00001000000000100000000000000000, :s25ee, :delay5, :setip, :saveip, :stopexec
add_logical_op 'bmsk', 0b00100, 0b010011
add_condbranch_op 'breq', 0b0000
add_condbranch_op 'brne', 0b0001
add_condbranch_op 'brlt', 0b0010
add_condbranch_op 'brge', 0b0011
add_condbranch_op 'brlo', 0b0100
add_condbranch_op 'brhs', 0b0101
addop32 'brk', 0b00100101011011110000000000111111, :stopexec
add_logical_op 'bset', 0b00100, 0b001111
# 0b00100bbb110100011BBBCCCCCC0QQQQQ
addop32 'btst', 0b00100000110100011000000000000000, :bext, :c, :ccond
# 0b00100bbb110100011BBBuuuuuu1QQQQQ
addop32 'btst', 0b00100000110100011000000000100000, :b, :u6, :ccond
# WTF 0b00100bbb010100011BBBuuuuuu0QQQQQ
addop32 'btst', 0b00100000010100011000000000000000, :b, :u6, :ccond
add_logical_op 'bxor', 0b00100, 0b010010
# 0b00100bbb100011001BBBssssssSSSSSS
addop32 'cmp', 0b00100000100011001000000000000000, :b, :s12
# WTF unknown encoding ...
# 0b00100bbb010011001BBBssssssSSSSSS
addop32 'cmp', 0b00100000010011001000000000000000, :b, :s12
# 0b00100bbb110011001BBBuuuuuu1QQQQQ
addop32 'cmp', 0b00100000110011001000000000100000, :b, :u6, :ccond
# WTF unknown encoding ...
# 0b00100bbb000011001BBBCCCCCC0QQQQQ
addop32 'cmp', 0b00100000000011001000000000000000, :bext, :cext, :ccond
# 0b00100bbb110011001BBBCCCCCC0QQQQQ
addop32 'cmp', 0b00100000110011001000000000000000, :bext, :cext, :ccond
add_artihm_op 'divaw', 0b00101, 0b001000, :extended
# 0b00100bbb00101111DBBBCCCCCC001100
addop32 'ex', 0b00100000001011110000000000001100, :b, :@cext, :cache16
# 0b00100bbb01101111DBBBuuuuuu001100
addop32 'ex', 0b00100000011011110000000000001100, :b, :@u6, :cache16
add_artihm_op_reduce 'extb', 0b00100, 0b000111
add_artihm_op_reduce 'extw', 0b00100, 0b001000
# WTF unknown encoding ...
# 0b00100rrr111010010RRRCCCCCC0QQQQQ
addop32 'flag', 0b00100000001010010000000000000000, :cext, :ccond, :flag_update
# 0b00100rrr111010010RRRuuuuuu1QQQQQ
addop32 'flag', 0b00100000001010010000000000100000, :u6, :ccond, :flag_update
# 0b00100rrr101010010RRRssssssSSSSSS
addop32 'flag', 0b00100000011010010000000000000000, :s12, :flag_update
add_condjmp_op()
add_condjmplink_op()
# 0b00100RRR001000000RRRCCCCCCRRRRRR
addop32 'j', 0b00100000001000000000000000000000, :@cext, :delay16, :setip, :stopexec
# 0b00100RRR011000000RRRuuuuuuRRRRRR
addop32 'j', 0b00100000011000000000000000000000, :u6, :delay16, :setip, :stopexec
# 0b00100RRR101000000RRRssssssSSSSSS
addop32 'j', 0b00100000101000000000000000000000, :s12, :delay16, :setip, :stopexec
# 0b00100RRR001000001RRR011101RRRRRR
addop32 'j.f', 0b00100000001000001000011101000000, :@ilink1, :flag_update, :setip, :stopexec
# 0b00100RRR001000001RRR011110RRRRRR
addop32 'j.f', 0b00100000001000001000011110000000, :@ilink2, :flag_update, :setip, :stopexec
# 0b00100RRR0010001D0RRRCCCCCCRRRRRR
addop32 'jl', 0b00100000001000100000000000000000, :@cext, :delay16, :setip, :saveip, :stopexec
# 0b00100RRR0110001D0RRRuuuuuuRRRRRR
addop32 'jl', 0b00100000011000100000000000000000, :u6, :delay16, :setip, :saveip, :stopexec
# 0b00100RRR1010001D0RRRssssssSSSSSS
addop32 'jl', 0b00100000101000100000000000000000, :s12, :delay16, :setip, :saveip, :stopexec
# 0b00010bbbssssssssSBBBDaaZZXAAAAAA
addop32 'ld', 0b00010000000000000000000000000000, :a, :@bs9, :sz7, :signext6, :wb9, :cache11
# 0b00100bbbaa110ZZXDBBBCCCCCCAAAAAA
addop32 'ld', 0b00100000001100000000000000000000, :a, :@bextcext, :sz17, :signext16, :wb22, :cache11
# 0b00100RRR111010000RRRuuuuuu1QQQQQ
addop32 'lp', 0b00100000111010000000000000100000, :u6e, :ccond, :setip
# 0b00100RRR101010000RRRssssssSSSSSS
addop32 'lp', 0b00100000101010000000000000000000, :s12e, :setip
# 0b00100bbb001010100BBBCCCCCCRRRRRR
addop32 'lr', 0b00100000101010100000000000000000, :b, :@c
# 0b00100bbb001010100BBB111110RRRRRR
addop32 'lr', 0b00100000001010100000111110000000, :b, :auxlimm
# 0b00100bbb101010100BBBssssssSSSSSS
addop32 'lr', 0b00100000011010100000000000000000, :b, :auxs12
# WTF unknown encoding ...
# 0b00100bbb101010100BBBssssssSSSSSS
addop32 'lr', 0b00100000101010100000000000000000, :b, :auxs12
add_artihm_op_reduce 'lsr', 0b00100, 0b000010
add_artihm_op 'lsr', 0b00101, 0b000001
add_artihm_op 'max', 0b00100, 0b001000
add_artihm_op 'min', 0b00100, 0b001001
# 0b00100bbb10001010FBBBssssssSSSSSS
addop32 'mov', 0b00100000100010100000000000000000, :b, :s12, :flags15
# WTF unknown encoding ...
# 0b00100bbb01001010FBBBssssssSSSSSS
addop32 'mov', 0b00100000010010100000000000000000, :b, :s12, :flags15
# 0b00100bbb11001010FBBBCCCCCC0QQQQQ
addop32 'mov', 0b00100000110010100000000000000000, :b, :cext, :ccond , :flags15
# WTF unknown encoding ..
# 0b00100bbb00001010FBBBCCCCCC0QQQQQ
addop32 'mov', 0b00100000000010100000000000000000, :b, :cext, :ccond , :flags15
# 0b00100bbb11001010FBBBuuuuuu1QQQQQ
addop32 'mov', 0b00100000110010100000000000100000, :b, :u6, :ccond , :flags15
add_artihm_op 'mpy', 0b00100, 0b011010, :extended
add_artihm_op 'mpyh', 0b00100, 0b011011, :extended
add_artihm_op 'mpyhu', 0b00100, 0b011100, :extended
add_artihm_op 'mpyu', 0b00100, 0b011101, :extended
# WTF: neg instruction is not differentiated from a rsub :a, :b, :u6
# : 0b00100bbb01001110FBBB000000AAAAAA
#addop32 'neg', 0b00100000010011100000000000000000, :a, :b, :flags15
# WTF: neg instruction is not differentiated from a rsub :b, :b2, :u6
# 0b00100bbb11001110FBBB0000001QQQQQ
#addop32 'neg', 0b00100000110011100000000000100000, :b, :b2, :ccond , :flags15
add_artihm_op_reduce 'negs', 0b00101, 0b000111
add_artihm_op_reduce 'negsw', 0b00101, 0b000110
# nop is an alias over mov null, 0 (mov - [:b, :s12, :flags15])
addop32 'nop', 0b00100110010010100111000000000000
add_artihm_op_reduce 'norm', 0b00101, 0b000001
add_artihm_op_reduce 'normw', 0b00101, 0b001000
add_artihm_op_reduce 'not', 0b00100, 0b001010
add_artihm_op 'or', 0b00100, 0b000101
# 0b00010bbbssssssssSBBB0aa000111110
addop32 'prefetch', 0b00010000000000000000000000111110, :@bs9, :wb
# 0b00100bbbaa1100000BBBCCCCCC111110
addop32 'prefetch', 0b00100000001100000000000000111110, :@bextcext, :wb22
# 0b00100bbb100011011BBBssssssSSSSSS
addop32 'rcmp', 0b00100000100011011000000000000000, :b, :s12
# 0b00100bbb110011011BBBCCCCCC0QQQQQ
addop32 'rcmp', 0b00100000110011011000000000000000, :bext, :cext, :ccond
# 0b00100bbb110011011BBBuuuuuu1QQQQQ
addop32 'rcmp', 0b00100000110011011000000000100000, :b, :u6, :ccond
add_artihm_op_reduce 'rlc', 0b00100, 0b001011
add_artihm_op_reduce 'rnd16', 0b00101, 0b000011
add_artihm_op_reduce 'ror', 0b00100, 0b000011
add_artihm_op 'ror', 0b00101, 0b000011, :extended
add_artihm_op_reduce 'rrc', 0b00100, 0b000100
add_artihm_op 'rsub', 0b00100, 0b001110
addop32 'rtie', 0b00100100011011110000000000111111, :setip, :stopexec
add_artihm_op_reduce 'sat16', 0b00101, 0b000010
add_artihm_op 'sbc', 0b00100, 0b000011
add_artihm_op_reduce 'sexb', 0b00100, 0b000101
add_artihm_op_reduce 'sexbw', 0b00100, 0b000110
# 0b00100001011011110000uuuuuu111111
addop32 'sleep', 0b00100001011011110000000000111111, :u6
# 0b00100bbb001010110BBBCCCCCCRRRRRR
addop32 'sr', 0b00100000001010110000000000000000, :bext, :@cext
# 0b00100110101010110111CCCCCCRRRRRR
addop32 'sr', 0b00100000101010110000000000000000, :bext, :auxs12
# WTF: unknown encoding
addop32 'sr', 0b00100000011010110000000000000000, :bext, :auxs12
# 0b00011bbbssssssssSBBBCCCCCCDaaZZR
addop32 'st', 0b00011000000000000000000000000000, :cext, :@bs9, :sz1, :wb3, :cache5
add_artihm_op 'sub', 0b00100, 0b000010
add_artihm_op 'sub1', 0b00100, 0b010111
add_artihm_op 'sub2', 0b00100, 0b011000
add_artihm_op 'sub3', 0b00100, 0b011001
# WTF: same encoding as xor instructions
#add_artihm_op 'subs', 0b00100, 0b000111
add_artihm_op 'subsdw', 0b00101, 0b101001, :extended
add_artihm_op_reduce 'swap', 0b00101, 0b000000
addop32 'swi', 0b00100010011011110000000000111111, :setip, :stopexec
addop32 'sync', 0b00100011011011110000000000111111
# 0b00100bbb100010111BBBssssssSSSSSS
addop32 'tst', 0b00100000100010111000000000000000, :b, :s12
# 0b00100bbb110010111BBBCCCCCC0QQQQQ
addop32 'tst', 0b00100000110010111000000000000000, :bext, :cext, :ccond
# 0b00100bbb110010111BBBuuuuuu1QQQQQ
addop32 'tst', 0b00100000110010111000000000100000, :b, :u6, :ccond
add_artihm_op 'xor', 0b00100, 0b000111
end
# ARCompact 16-bit instructions
def init_arc_compact16
addop16 'abs_s', 0x7811, :cb, :cc
addop16 'add_s', 0x6018, :ca, :cb, :cc
addop16 'add_s', 0x7000, :cb, :cb2, :ch
addop16 'add_s', 0x6800, :cc, :cb, :cu3
addop16 'add_s', 0xe000, :cb, :cb2, :cu7
# same encoding as add_s b,b,h
#addop16 'add_s', 0x70c7, :cb, :cb2, :climm
addop16 'add_s', 0xc080, :cb, :sp, :cu5ee
addop16 'add_s', 0xc0a0, :sp, :sp2, :cu5ee
addop16 'add_s', 0xce00, :cr0, :gp, :cs9
addop16 'add1_s', 0x7814, :cb, :cb2, :cc
addop16 'add2_s', 0x7815, :cb, :cb2, :cc
addop16 'add3_s', 0x7816, :cb, :cb2, :cc
addop16 'and_s', 0x7804, :cb, :cb2, :cc
addop16 'asl_s', 0x7818, :cb, :cb2, :cc
addop16 'asl_s', 0x6810, :cc, :cb, :cu3
addop16 'asl_s', 0xb800, :cb, :cb2, :cu5
addop16 'asl_s', 0x781b, :cb, :cc
addop16 'asr_s', 0x781a, :cb, :cb2, :cc
addop16 'asr_s', 0x6818, :cc, :cb, :cu3
addop16 'asr_s', 0xb840, :cb, :cb2, :cu5
addop16 'asr_s', 0x781c, :cb, :cc
addop16 'b_s', 0xf000, :cdisps10, :setip, :stopexec
addop16 'beq_s', 0xf200, :cdisps10, :setip
addop16 'bne_s', 0xf400, :cdisps10, :setip
addop16 'bgt_s', 0xf600, :cdisps7, :setip
addop16 'bge_s', 0xf640, :cdisps7, :setip
addop16 'blt_s', 0xf680, :cdisps7, :setip
addop16 'ble_s', 0xf6c0, :cdisps7, :setip
addop16 'bhi_s', 0xf700, :cdisps7, :setip
addop16 'bhs_s', 0xf740, :cdisps7, :setip
addop16 'blo_s', 0xf780, :cdisps7, :setip
addop16 'bls_s', 0xf7c0, :cdisps7, :setip
addop16 'bclr_s', 0xb8a0, :cb, :cb2, :cu5
addop16 'bic_s', 0x7806, :cb, :cb2, :cc
addop16 'bl_s', 0xf800, :cdisps13, :setip, :saveip, :stopexec
addop16 'bmsk_s', 0xb8c0, :cb, :cb2, :cu5
addop16 'breq_s', 0xe800, :cb, :zero, :cdisps8, :setip
addop16 'brne_s', 0xe880, :cb, :zero, :cdisps8, :setip
addop16 'brk_s', 0x7fff
addop16 'bset_s', 0xb880, :cb, :cb2, :cu5
addop16 'btst_s', 0xb8e0, :cb, :cu5
addop16 'cmp_s', 0x7010, :cb, :ch
addop16 'cmp_s', 0xe080, :cb, :cu7
# encoded over cmp_s b,h
# addop16 'cmp_s', 0x70d7, :cb, :limm
addop16 'extb_s', 0x780f, :cb, :cc
addop16 'extw_s', 0x7810, :cb, :cc
addop16 'j_s', 0x7800, :@cb, :setip, :stopexec
addop16 'j_s.d', 0x7820, :@cb, :setip, :stopexec, :delay_slot
addop16 'j_s', 0x7ee0, :@blink, :setip, :stopexec
addop16 'j_s.d', 0x7fe0, :@blink, :setip, :stopexec, :delay_slot
addop16 'jeq_s', 0x7ce0, :@blink, :setip
addop16 'jne_s', 0x7de0, :@blink, :setip
addop16 'jl_s', 0x7840, :@cb, :setip, :saveip, :stopexec
addop16 'jl_s.d', 0x7860, :@cb, :setip, :saveip, :stopexec, :delay_slot
addop16 'ld_s', 0x6000, :ca, :@cbcc
addop16 'ldb_s', 0x6008, :ca, :@cbcc
addop16 'ldw_s', 0x6010, :ca, :@cbcc
addop16 'ld_s', 0x8000, :cc, :@cbu7
addop16 'ldb_s', 0x8800, :cc, :@cbu5
addop16 'ldw_s', 0x9000, :cc, :@cbu6
addop16 'ldw_s.x', 0x9800, :cc, :@cbu6
addop16 'ld_s', 0xc000, :cb, :@cspu7
addop16 'ldb_s', 0xc020, :cb, :@cspu7
addop16 'ld_s', 0xc800, :cr0, :@gps11
addop16 'ldb_s', 0xca00, :cr0, :@gps9
addop16 'ldw_s', 0xcc00, :cr0, :@gps10
addop16 'ld_s', 0xd000, :cb, :@pclu10
# FIXME: exact same encoding as asl_s instructions
#addop16 'lsl_s', 0x7818, :cb, :cb2, :cc
#addop16 'lsl_s', 0x6810, :cc, :cb, :cu3
#addop16 'lsl_s', 0xb800, :cb, :cb2, :cu5
#addop16 'lsl_s', 0x781d, :cb, :cc
addop16 'lsr_s', 0x7819, :cb, :cb2, :cc
addop16 'lsr_s', 0xb820, :cb, :cb2, :cu5
addop16 'lsr_s', 0x781d, :cb, :cc
addop16 'mov_s', 0x7008, :cb, :ch
# FIXME: same encoding as previous instruction
#addop16 'mov_s', 0x70cf, :cb, :limm
addop16 'mov_s', 0xd800, :cb, :cu8
addop16 'mov_s', 0x7018, :ch, :cb
# TODO seems to overlap with previous instruction
addop16 'mov_s', 0x70df, :zero, :cb
addop16 'mul64_s', 0x780c, :zero, :cb, :cc
addop16 'neg_s', 0x7813, :cb, :cc
addop16 'not_s', 0x7812, :cb, :cc
addop16 'nop_s',0x78e0
addop16 'unimp_s', 0x79e0
addop16 'or_s', 0x7805, :cb, :cb2, :cc
addop16 'pop_s', 0xc0c1, :cb
addop16 'pop_s', 0xc0d1, :blink
addop16 'push_s', 0xc0e1, :cb
addop16 'push_s', 0xc0f1, :blink
addop16 'sexb_s', 0x780d, :cb, :cc
addop16 'sexw_s', 0x780e, :cb, :cc
addop16 'st_s', 0xc040, :cb, :@cspu7
addop16 'stb_s', 0xc060, :cb, :@cspu7
addop16 'st_s', 0xa000, :cc, :@cbu7
addop16 'stb_s', 0xa800, :cc, :@cbu5
addop16 'stw_s', 0xb000, :cc, :@cbu6
addop16 'sub_s', 0x7802, :cb, :cb2, :cc
addop16 'sub_s', 0x6808, :cc, :cb, :cu3
addop16 'sub_s', 0xb860, :cb, :cb2, :cu5
addop16 'sub_s', 0xc1a0, :sp, :sp2, :cu5ee
addop16 'sub_s.ne', 0x78c0, :cb, :cb2, :cb3
addop16 'trap_s', 0x781E, :cu6, :setip, :stopexec
addop16 'tst_s', 0x780b, :cb, :cc
addop16 'xor_s', 0x7807, :cb, :cb2, :cc
end
end
end
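The mask and shift tables above can be read as "shift right, then mask". The sketch below hand-assembles one word from the first `add_artihm_op` template and pulls the operands back out. `field_val` is a hypothetical helper, not a Metasm method. Treating `bbb` (bits 24..26) as the low half of the B register number and `BBB` (bits 12..14) as the high half is an assumption based on the bit patterns in the comments.

```ruby
# Hypothetical helper, not part of Metasm: shift first, then mask
def field_val(word, mask, shift)
  (word >> shift) & mask
end

# Masks and shifts copied from fields_mask32 / fields_shift32
a_mask, a_shift = 0x3f, 0x0
c_mask, c_shift = 0x3f, 0x6
b_mask, b_shift = 0b111000000000111, 0xC

# 0bxxxxxbbb00xxxxxxFBBBCCCCCCAAAAAA with major 0b00100, subcode 0 (add)
word  = (0b00100 << 0x1b) | (0b000000 << 16)
word |= 3                 # A field: r3
word |= 2 << 6            # C field: r2
word |= (9 & 7) << 24     # bbb: assumed low three bits of r9
word |= (9 >> 3) << 12    # BBB: assumed high three bits of r9

a     = field_val(word, a_mask, a_shift)
c     = field_val(word, c_mask, c_shift)
raw_b = field_val(word, b_mask, b_shift)
# After the shift, BBB sits in bits 0..2 and bbb in bits 12..14
b = ((raw_b >> 12) & 7) | ((raw_b & 7) << 3)
```

The split `:b` mask is why that field needs a recombination step, while `:a` and `:c` come out directly.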
-14
@@ -1,14 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
class Metasm::ARM < Metasm::CPU
end
require 'metasm/main'
require 'metasm/cpu/arm/parse'
require 'metasm/cpu/arm/encode'
require 'metasm/cpu/arm/decode'
require 'metasm/cpu/arm/render'
require 'metasm/cpu/arm/debug'
-323
@@ -1,323 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/arm/main'
module Metasm
class ARM
private
# ARM MODE
def addop(name, bin, *args)
args << :cond if not args.delete :uncond
suppl = nil
o = Opcode.new name, bin
args.each { |a|
# Should Be One fields
if a == :sbo16 ; o.bin |= 0b1111 << 16 ; next ; end
if a == :sbo12 ; o.bin |= 0b1111 << 12 ; next ; end
if a == :sbo8 ; o.bin |= 0b1111 << 8 ; next ; end
if a == :sbo0 ; o.bin |= 0b1111 << 0 ; next ; end
o.args << a if @valid_args[a]
o.props[a] = true if @valid_props[a]
o.props.update a if a.kind_of?(Hash)
# special args -> multiple fields
suppl ||= { :i8_r => [:i8, :rotate], :rm_is => [:rm, :stype, :shifti],
:rm_rs => [:rm, :stype, :rs], :mem_rn_rm => [:rn, :rm, :rsx, :u],
:mem_rn_i8_12 => [:rn, :i8_12, :u],
:mem_rn_rms => [:rn, :rm, :stype, :shifti, :i],
:mem_rn_i12 => [:rn, :i12, :u]
}[a]
}
args.concat suppl if suppl
args.each { |a| o.fields[a] = [@fields_mask[a], @fields_shift[a]] if @fields_mask[a] }
@opcode_list << o
end
def addop_data_s(name, op, a1, a2, *h)
addop name, op | (1 << 25), a1, a2, :i8_r, :rotate, *h
addop name, op, a1, a2, :rm_is, *h
addop name, op | (1 << 4), a1, a2, :rm_rs, *h
end
def addop_data(name, op, a1, a2)
addop_data_s name, op << 21, a1, a2
addop_data_s name+'s', (op << 21) | (1 << 20), a1, a2, :cond_name_off => name.length
end
def addop_load_puw(name, op, *a)
addop name, op, {:baseincr => :post}, :rd, :u, *a
addop name, op | (1 << 24), :rd, :u, *a
addop name, op | (1 << 24) | (1 << 21), {:baseincr => :pre}, :rd, :u, *a
end
def addop_load_lsh_o(name, op)
addop_load_puw name, op, :rsz, :mem_rn_rm, {:cond_name_off => 3}
addop_load_puw name, op | (1 << 22), :mem_rn_i8_12, {:cond_name_off => 3}
end
def addop_load_lsh
op = 9 << 4
addop_load_lsh_o 'strh', op | (1 << 5)
addop_load_lsh_o 'ldrd', op | (1 << 6)
addop_load_lsh_o 'strd', op | (1 << 6) | (1 << 5)
addop_load_lsh_o 'ldrh', op | (1 << 20) | (1 << 5)
addop_load_lsh_o 'ldrsb', op | (1 << 20) | (1 << 6)
addop_load_lsh_o 'ldrsh', op | (1 << 20) | (1 << 6) | (1 << 5)
end
def addop_load_puwt(name, op, *a)
addop_load_puw name, op, *a
addop name+'t', op | (1 << 21), {:baseincr => :post, :cond_name_off => name.length}, :rd, :u, *a
end
def addop_load_o(name, op, *a)
addop_load_puwt name, op, :mem_rn_i12, *a
addop_load_puwt name, op | (1 << 25), :mem_rn_rms, *a
end
def addop_load(name, op)
addop_load_o name, op
addop_load_o name+'b', op | (1 << 22), :cond_name_off => name.length
end
def addop_ldm_go(name, op, *a)
addop name, op, :rn, :reglist, {:cond_name_off => 3}, *a
end
def addop_ldm_w(name, op, *a)
addop_ldm_go name, op, *a # base reg untouched
addop_ldm_go name, op | (1 << 21), {:baseincr => :post}, *a # base updated
end
def addop_ldm_s(name, op)
addop_ldm_w name, op # transfer regs
addop_ldm_w name, op | (1 << 22), :usermoderegs # transfer usermode regs
end
def addop_ldm_p(name, op)
addop_ldm_s name+'a', op # target memory included
addop_ldm_s name+'b', op | (1 << 24) # target memory excluded, transfer starts at next addr
end
def addop_ldm_u(name, op)
addop_ldm_p name+'d', op # transfer made downward
addop_ldm_p name+'i', op | (1 << 23) # transfer made upward
end
def addop_ldm(name, op)
addop_ldm_u name, op
end
# ARMv6 instruction set, aka arm7/arm9
def init_arm_v6
@opcode_list = []
[:baseincr, :cond, :cond_name_off, :usermoderegs, :tothumb, :tojazelle
].each { |p| @valid_props[p] = true }
[:rn, :rd, :rm, :crn, :crd, :crm, :cpn, :reglist, :i24, :rm_rs, :rm_is,
:i8_r, :mem_rn_rm, :mem_rn_i8_12, :mem_rn_rms, :mem_rn_i12
].each { |p| @valid_args[p] = true }
@fields_mask.update :rn => 0xf, :rd => 0xf, :rs => 0xf, :rm => 0xf,
:crn => 0xf, :crd => 0xf, :crm => 0xf, :cpn => 0xf,
:rnx => 0xf, :rdx => 0xf, :rsx => 0xf,
:shifti => 0x1f, :stype => 3, :rotate => 0xf, :reglist => 0xffff,
:i8 => 0xff, :i12 => 0xfff, :i24 => 0xff_ffff, :i8_12 => 0xf0f,
:u => 1, :mask => 0xf, :sbo => 0xf, :cond => 0xf
@fields_shift.update :rn => 16, :rd => 12, :rs => 8, :rm => 0,
:crn => 16, :crd => 12, :crm => 0, :cpn => 8,
:rnx => 16, :rdx => 12, :rsx => 8,
:shifti => 7, :stype => 5, :rotate => 8, :reglist => 0,
:i8 => 0, :i12 => 0, :i24 => 0, :i8_12 => 0,
:u => 23, :mask => 16, :sbo => 12, :cond => 28
addop_data 'and', 0, :rd, :rn
addop_data 'eor', 1, :rd, :rn
addop_data 'xor', 1, :rd, :rn
addop_data 'sub', 2, :rd, :rn
addop_data 'rsb', 3, :rd, :rn
addop_data 'add', 4, :rd, :rn
addop_data 'adc', 5, :rd, :rn
addop_data 'sbc', 6, :rd, :rn
addop_data 'rsc', 7, :rd, :rn
addop_data_s 'tst', (8 << 21) | (1 << 20), :rdx, :rn
addop_data_s 'teq', (9 << 21) | (1 << 20), :rdx, :rn
addop_data_s 'cmp', (10 << 21) | (1 << 20), :rdx, :rn
addop_data_s 'cmn', (11 << 21) | (1 << 20), :rdx, :rn
addop_data 'orr', 12, :rd, :rn
addop_data 'or', 12, :rd, :rn
addop_data 'mov', 13, :rd, :rnx
addop_data 'bic', 14, :rd, :rn
addop_data 'mvn', 15, :rd, :rnx
addop 'b', 0b1010 << 24, :setip, :stopexec, :i24
addop 'bl', 0b1011 << 24, :setip, :stopexec, :i24, :saveip
addop 'bkpt', (0b00010010 << 20) | (0b0111 << 4) # other fields are available&unused, also cnd != AL is undef
addop 'blx', 0b1111101 << 25, :setip, :stopexec, :saveip, :tothumb, :h, :uncond, :i24
addop 'blx', (0b00010010 << 20) | (0b0011 << 4), :setip, :stopexec, :saveip, :tothumb, :rm, :sbo16, :sbo12, :sbo8
addop 'bx', (0b00010010 << 20) | (0b0001 << 4), :setip, :stopexec, :rm, :sbo16, :sbo12, :sbo8
addop 'bxj', (0b00010010 << 20) | (0b0010 << 4), :setip, :stopexec, :rm, :tojazelle, :sbo16, :sbo12, :sbo8
addop_load 'str', (1 << 26)
addop_load 'ldr', (1 << 26) | (1 << 20)
addop_load_lsh
addop_ldm 'stm', (1 << 27)
addop_ldm 'ldm', (1 << 27) | (1 << 20)
# TODO aliases (http://www.davespace.co.uk/arm/introduction-to-arm/stack.html)
# fd = full descending stmfd/ldmfd = stmdb/ldmia
# ed = empty descending stmed/ldmed = stmda/ldmib
# fa = full ascending stmfa/ldmfa = stmib/ldmda
# ea = empty ascending stmea/ldmea = stmia/ldmdb
# TODO mrs, [qus]add/sub*
addop 'clz', (0b00010110 << 20) | (0b0001 << 4), :rd, :rm, :sbo16, :sbo8
addop 'ldrex', (0b00011001 << 20) | (0b1001 << 4), :rd, :rn, :sbo8, :sbo0
addop 'strex', (0b00011000 << 20) | (0b1001 << 4), :rd, :rm, :rn, :sbo8
addop 'rev', (0b01101011 << 20) | (0b0011 << 4), :rd, :rm, :sbo16, :sbo8
addop 'rev16', (0b01101011 << 20) | (0b1011 << 4), :rd, :rm, :sbo16, :sbo8
addop 'revsh', (0b01101111 << 20) | (0b1011 << 4), :rd, :rm, :sbo16, :sbo8
addop 'sel', (0b01101000 << 20) | (0b1011 << 4), :rd, :rn, :rm, :sbo8
end
# THUMB2 MODE
def addop_t(name, bin, *args)
o = Opcode.new name, bin
args.each { |a|
o.args << a if @valid_args[a]
o.props[a] = true if @valid_props[a]
o.props.update a if a.kind_of?(Hash)
}
args.each { |a| o.fields[a] = [@fields_mask[a], @fields_shift[a]] if @fields_mask[a] }
@opcode_list_t << o
end
def init_arm_thumb2
@opcode_list_t = []
@valid_props_t = {}
@valid_args_t = {}
@fields_mask_t = {}
@fields_shift_t = {}
[:i16, :i16_3_8, :i16_rd].each { |p| @valid_props_t[p] = true }
[:i5, :rm, :rn, :rd].each { |p| @valid_args_t[p] = true }
@fields_mask_t.update :i5 => 0x1f, :i3 => 7, :i51 => 0x5f,
:rm => 7, :rn => 7, :rd => 7, :rdn => 7, :rdn8 => 7
@fields_shift_t.update :i5 => 6, :i3 => 6, :i51 => 3,
:rm => 6, :rn => 3, :rd => 0, :rdn => 0, :rdn8 => 8
addop_t 'mov', 0b000_00 << 11, :rd, :rm
addop_t 'lsl', 0b000_00 << 11, :rd, :rm, :i5
addop_t 'lsr', 0b000_01 << 11, :rd, :rm, :i5
addop_t 'asr', 0b000_10 << 11, :rd, :rm, :i5
addop_t 'add', 0b000_1100 << 9, :rd, :rn, :rm
addop_t 'add', 0b000_1110 << 9, :rd, :rn, :i3
addop_t 'sub', 0b000_1101 << 9, :rd, :rn, :rm
addop_t 'sub', 0b000_1111 << 9, :rd, :rn, :i3
addop_t 'mov', 0b001_00 << 10, :rdn8, :i8
addop_t 'cmp', 0b001_01 << 10, :rdn8, :i8
addop_t 'add', 0b001_10 << 10, :rdn8, :i8
addop_t 'sub', 0b001_11 << 10, :rdn8, :i8
addop_t 'and', (0b010000 << 10) | ( 0 << 6), :rdn, :rm
addop_t 'eor', (0b010000 << 10) | ( 1 << 6), :rdn, :rm # xor
addop_t 'lsl', (0b010000 << 10) | ( 2 << 6), :rdn, :rm
addop_t 'lsr', (0b010000 << 10) | ( 3 << 6), :rdn, :rm
addop_t 'asr', (0b010000 << 10) | ( 4 << 6), :rdn, :rm
addop_t 'adc', (0b010000 << 10) | ( 5 << 6), :rdn, :rm
addop_t 'sbc', (0b010000 << 10) | ( 6 << 6), :rdn, :rm
addop_t 'ror', (0b010000 << 10) | ( 7 << 6), :rdn, :rm
addop_t 'tst', (0b010000 << 10) | ( 8 << 6), :rdn, :rm
addop_t 'rsb', (0b010000 << 10) | ( 9 << 6), :rdn, :rm
addop_t 'cmp', (0b010000 << 10) | (10 << 6), :rdn, :rm
addop_t 'cmn', (0b010000 << 10) | (11 << 6), :rdn, :rm
addop_t 'orr', (0b010000 << 10) | (12 << 6), :rdn, :rm # or
addop_t 'mul', (0b010000 << 10) | (13 << 6), :rdn, :rm
addop_t 'bic', (0b010000 << 10) | (14 << 6), :rdn, :rm
addop_t 'mvn', (0b010000 << 10) | (15 << 6), :rdn, :rm
addop_t 'add', 0b010001_00 << 8, :rdn, :rm, :dn
addop_t 'cmp', 0b010001_01 << 8, :rdn, :rm, :dn
addop_t 'mov', 0b010001_10 << 8, :rdn, :rm, :dn
addop_t 'bx', 0b010001_110 << 7, :rm
addop_t 'blx', 0b010001_111 << 7, :rm
addop_t 'ldr', 0b01001 << 11, :rd, :pc_i8
addop_t 'str', 0b0101_000 << 9, :rd, :rn, :rm
addop_t 'strh', 0b0101_001 << 9, :rd, :rn, :rm
addop_t 'strb', 0b0101_010 << 9, :rd, :rn, :rm
addop_t 'ldrsb', 0b0101_011 << 9, :rd, :rn, :rm
addop_t 'ldr', 0b0101_100 << 9, :rd, :rn, :rm
addop_t 'ldrh', 0b0101_101 << 9, :rd, :rn, :rm
addop_t 'ldrb', 0b0101_110 << 9, :rd, :rn, :rm
addop_t 'ldrsh', 0b0101_111 << 9, :rd, :rn, :rm
addop_t 'str', 0b01100 << 11, :rd, :rn, :i5
addop_t 'ldr', 0b01101 << 11, :rd, :rn, :i5
addop_t 'strb', 0b01110 << 11, :rd, :rn, :i5
addop_t 'ldrb', 0b01111 << 11, :rd, :rn, :i5
addop_t 'strh', 0b10000 << 11, :rd, :rn, :i5
addop_t 'ldrh', 0b10001 << 11, :rd, :rn, :i5
addop_t 'str', 0b10010 << 11, :rd, :sp_i8
addop_t 'ldr', 0b10011 << 11, :rd, :sp_i8
addop_t 'adr', 0b10100 << 11, :rd, :pc, :i8
addop_t 'add', 0b10101 << 11, :rd, :sp, :i8
# 0b1011 misc
addop_t 'add', 0b1011_0000_0 << 7, :sp, :i7
addop_t 'sub', 0b1011_0000_1 << 7, :sp, :i7
addop_t 'sxth', 0b1011_0010_00 << 6, :rd, :rn
addop_t 'sxtb', 0b1011_0010_01 << 6, :rd, :rn
addop_t 'uxth', 0b1011_0010_10 << 6, :rd, :rn
addop_t 'uxtb', 0b1011_0010_11 << 6, :rd, :rn
addop_t 'cbz', 0b1011_0001 << 8, :rd, :i51
addop_t 'cbnz', 0b1011_1001 << 8, :rd, :i51
addop_t 'push', 0b1011_0100 << 8, :rlist
addop_t 'push', 0b1011_0101 << 8, :rlist
addop_t 'pop', 0b1011_1100 << 8, :rlist
addop_t 'pop', 0b1011_1101 << 8, :rlist
#addop_t 'unpredictable', 0b1011_0110_0100_0000, :i4
addop_t 'setendle', 0b1011_0110_0101_0000
addop_t 'setendbe', 0b1011_0110_0101_1000
addop_t 'cps', 0b1011_0110_0110_0000
#addop_t 'unpredictable', 0b1011_0110_0110_1000, :msk_0001_0111
addop_t 'rev', 0b1011_1010_00 << 6, :rd, :rn
addop_t 'rev16', 0b1011_1010_01 << 6, :rd, :rn
addop_t 'revsh', 0b1011_1010_11 << 6, :rd, :rn
addop_t 'bkpt', 0b1011_1110 << 8, :i8
addop_t 'it', 0b1011_1111 << 8, :itcond, :itmsk
addop_t 'nop', 0b1011_1111_0000_0000
addop_t 'yield', 0b1011_1111_0000_0001
addop_t 'wfe', 0b1011_1111_0000_0010
addop_t 'wfi', 0b1011_1111_0000_0011
addop_t 'sev', 0b1011_1111_0000_0100
addop_t 'nop', 0b1011_1111_0000_0000, :i4
addop_t 'stmia', 0b11000 << 11, :rn, :rlist # stmea
addop_t 'ldmia', 0b11001 << 11, :rn, :rlist # ldmfd
addop_t 'undef', 0b1101_1110 << 8, :i8
addop_t 'svc', 0b1101_1111 << 8, :i8
addop_t 'b', 0b1101 << 12, :cond, :i8
addop_t 'b', 0b11100 << 11, :i11
# thumb-32
end
def init_arm_v6_thumb2
init_arm_v6
init_arm_thumb2
end
alias init_latest init_arm_v6_thumb2
end
end
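The `addop_t` entries above pair a fixed bit pattern with operand fields such as `:i8`. As a rough illustration of what that implies for decoding, here is a minimal standalone sketch that matches a 16-bit Thumb halfword against three of those patterns. `THUMB_SAMPLE` and `match_thumb` are hypothetical names, and the masks are assumptions derived from the shift amounts shown above; this is not Metasm's decoder.

```ruby
# Hypothetical, standalone illustration; not Metasm's actual decoder.
# Each entry: [name, fixed bits, mask covering the fixed bits].
THUMB_SAMPLE = [
  ['bkpt', 0b1011_1110 << 8,        0xff00], # low 8 bits carry :i8
  ['svc',  0b1101_1111 << 8,        0xff00], # low 8 bits carry :i8
  ['wfi',  0b1011_1111_0000_0011,   0xffff]  # no operand field
].freeze

# Returns { name:, imm: } for the first matching entry, or nil.
def match_thumb(halfword)
  name, _bits, mask = THUMB_SAMPLE.find { |_, b, m| halfword & m == b }
  return nil unless name
  # Whatever is not fixed by the opcode is treated as the immediate.
  { name: name, imm: halfword & (~mask & 0xffff) }
end
```

A real opcode table needs one mask per field rather than a single "everything else" immediate; the sketch only covers the single-operand case.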
-142
@@ -1,142 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/bpf/opcodes'
require 'metasm/decode'
module Metasm
class BPF
def build_bin_lookaside
opcode_list.inject({}) { |h, op| h.update op.bin => op }
end
# tries to find the opcode encoded at edata.ptr
def decode_findopcode(edata)
return if edata.ptr > edata.data.length-8
di = DecodedInstruction.new self
code = edata.data[edata.ptr, 2].unpack('v')[0]
return di if di.opcode = @bin_lookaside[code]
end
def decode_instr_op(edata, di)
op = di.opcode
di.instruction.opname = op.name
di.bin_length = 8
code, jt, jf, k = edata.read(8).unpack('vCCV')
op.args.each { |a|
di.instruction.args << case a
when :k; Expression[k]
when :x; Reg.new(:x)
when :a; Reg.new(:a)
when :len; Reg.new(:len)
when :p_k; PktRef.new(nil, Expression[k], op.props[:msz])
when :p_xk; PktRef.new(Reg.new(:x), Expression[k], op.props[:msz])
when :m_k; MemRef.new(nil, Expression[4*k], 4)
when :jt; Expression[jt]
when :jf; Expression[jf]
else raise "unhandled arg #{a}"
end
}
# je a, x, 0, 12 -> jne a, x, 12
# je a, x, 12, 0 -> je a, x, 12
if op.args[2] == :jt and di.instruction.args[2] == Expression[0]
di.opcode = op.dup
di.opcode.props.delete :stopexec
di.instruction.opname = { 'jg' => 'jle', 'jge' => 'jl', 'je' => 'jne', 'jtest' => 'jntest' }[di.instruction.opname]
di.instruction.args.delete_at(2)
elsif op.args[3] == :jf and di.instruction.args[3] == Expression[0]
di.opcode = op.dup
di.opcode.props.delete :stopexec
di.instruction.args.delete_at(3)
end
di
end
def decode_instr_interpret(di, addr)
if di.opcode.props[:setip]
delta = di.instruction.args[-1].reduce + 1
arg = Expression[addr, :+, 8*delta].reduce
di.instruction.args[-1] = Expression[arg]
if di.instruction.args.length == 4
delta = di.instruction.args[2].reduce + 1
arg = Expression[addr, :+, 8*delta].reduce
di.instruction.args[2] = Expression[arg]
end
end
di
end
# hash opcode_name => lambda { |di, *symbolic_args| instr_binding }
def backtrace_binding
@backtrace_binding ||= init_backtrace_binding
end
def backtrace_binding=(b) @backtrace_binding = b end
# populate the @backtrace_binding hash with default values
def init_backtrace_binding
@backtrace_binding ||= {}
opcode_list.map { |ol| ol.basename }.uniq.sort.each { |op|
binding = case op
when 'mov'; lambda { |di, a0, a1| { a0 => Expression[a1] } }
when 'add'; lambda { |di, a0, a1| { a0 => Expression[a0, :+, a1] } }
when 'sub'; lambda { |di, a0, a1| { a0 => Expression[a0, :-, a1] } }
when 'mul'; lambda { |di, a0, a1| { a0 => Expression[a0, :*, a1] } }
when 'div'; lambda { |di, a0, a1| { a0 => Expression[a0, :/, a1] } }
when 'shl'; lambda { |di, a0, a1| { a0 => Expression[a0, :<<, a1] } }
when 'shr'; lambda { |di, a0, a1| { a0 => Expression[a0, :>>, a1] } }
when 'neg'; lambda { |di, a0| { a0 => Expression[:-, a0] } }
when 'msh'; lambda { |di, a0, a1| { a0 => Expression[[a1, :&, 0xf], :<<, 2] } }
when 'jmp', 'jg', 'jge', 'je', 'jtest', 'ret'; lambda { |di, *a| { } }
end
@backtrace_binding[op] ||= binding if binding
}
@backtrace_binding
end
def get_backtrace_binding(di)
a = di.instruction.args.map { |arg|
case arg
when PktRef, MemRef, Reg; arg.symbolic(di)
else arg
end
}
if binding = backtrace_binding[di.opcode.name]
binding[di, *a]
else
puts "unhandled instruction to backtrace: #{di}" if $VERBOSE
{:incomplete_binding => Expression[1]}
end
end
def get_xrefs_x(dasm, di)
return [] if not di.opcode.props[:setip]
if di.instruction.args.length == 4
di.instruction.args[-2, 2]
else
di.instruction.args[-1, 1]
end
end
# updates an instruction's argument replacing an expression with another (eg label renamed)
def replace_instr_arg_immediate(i, old, new)
i.args.map! { |a|
case a
when Expression; a == old ? new : Expression[a.bind(old => new).reduce]
else a
end
}
end
end
end
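The BPF decoder above reads each instruction as a fixed 8-byte record and turns relative jump offsets into absolute addresses. The following standalone sketch restates those two steps with plain Ruby. `bpf_fields` and `bpf_target` are hypothetical helper names, and the field layout is taken from the `'vCCV'` unpack string in `decode_instr_op`.

```ruby
# Standalone sketch; helper names are hypothetical, not Metasm's API.
BPF_INSN_SIZE = 8

# Split one 8-byte instruction into its fields, using the same 'vCCV'
# layout as decode_instr_op: u16 code, u8 jt, u8 jf, u32 k (little-endian).
def bpf_fields(raw)
  code, jt, jf, k = raw.unpack('vCCV')
  { code: code, jt: jt, jf: jf, k: k }
end

# Jump offsets count instructions relative to the next instruction, so the
# absolute target is addr + 8 * (offset + 1), as in decode_instr_interpret.
def bpf_target(addr, offset)
  addr + BPF_INSN_SIZE * (offset + 1)
end

# 0x15 is 'je a, k' in the opcode table (0x10 | 0x05).
BPF_SAMPLE_RAW = [0x15, 2, 5, 0x0800].pack('vCCV').freeze
```

With an instruction at address 16 and `jt` equal to 2, `bpf_target(16, 2)` skips the current instruction plus two more.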
-60
@@ -1,60 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
module Metasm
class BPF < CPU
class Reg
attr_accessor :v
def initialize(v)
@v = v
end
def symbolic(orig=nil) ; @v ; end
end
class MemRef
attr_accessor :base, :offset, :msz
def memtype
:mem
end
def initialize(base, offset, msz)
@base = base
@offset = offset
@msz = msz
end
def symbolic(orig)
p = Expression[memtype]
p = Expression[p, :+, @base.symbolic] if base
p = Expression[p, :+, @offset] if offset
Indirection[p, @msz, orig]
end
end
class PktRef < MemRef
def memtype
:pkt
end
end
def initialize(family = :latest)
super()
@endianness = :big
@size = 32
@family = family
end
def init_opcode_list
send("init_#@family")
@opcode_list
end
end
end
-81
@@ -1,81 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/bpf/main'
module Metasm
class BPF
def addop(name, bin, *args)
o = Opcode.new name, bin
args.each { |a|
o.args << a if @valid_args[a]
o.props.update a if a.kind_of?(::Hash)
}
@opcode_list << o
end
def addop_ldx(bin, src)
addop 'mov', bin | 0x00, :a, src
addop 'mov', bin | 0x01, :x, src
end
def addop_ldsz(bin, src)
addop 'mov', bin | 0x00, :a, src, :msz => 4
addop 'mov', bin | 0x08, :a, src, :msz => 2
addop 'mov', bin | 0x10, :a, src, :msz => 1
end
def addop_alu(name, bin)
addop name, bin | 0x04, :a, :k
addop name, bin | 0x0C, :a, :x
end
def addop_j(name, bin)
addop name, bin | 0x05 | 0x00, :a, :k, :jt, :jf, :setip => true, :stopexec => true
addop name, bin | 0x05 | 0x08, :a, :x, :jt, :jf, :setip => true, :stopexec => true
end
def init_bpf
@opcode_list = []
[:a, :k, :x, :len, :m_k, :p_k, :p_xk, :jt, :jf].each { |a| @valid_args[a] = true }
# LD/ST
addop_ldx 0x00, :k
addop_ldsz 0x20, :p_k
addop_ldsz 0x40, :p_xk
addop_ldx 0x60, :m_k
addop_ldx 0x80, :len
addop 'msh', 0xB1, :x, :p_k, :msz => 1
addop 'mov', 0x02, :m_k, :a
addop 'mov', 0x03, :m_k, :x
# ALU
addop_alu 'add', 0x00
addop_alu 'sub', 0x10
addop_alu 'mul', 0x20
addop_alu 'div', 0x30
addop_alu 'or', 0x40
addop_alu 'and', 0x50
addop_alu 'shl', 0x60
addop_alu 'shr', 0x70
addop 'neg', 0x84, :a
# JMP
addop 'jmp', 0x05, :k, :setip => true, :stopexec => true
addop_j 'je', 0x10
addop_j 'jg', 0x20
addop_j 'jge', 0x30
addop_j 'jtest',0x40
addop 'ret', 0x06, :k, :stopexec => true
addop 'ret', 0x16, :a, :stopexec => true
addop 'mov', 0x07, :x, :a
addop 'mov', 0x87, :a, :x
end
alias init_latest init_bpf
end
end
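The `addop_alu` and `addop_j` helpers above build each opcode value by OR-ing a per-operation base with class and source bits. A minimal standalone sketch of that composition follows. `alu_codes`, `jump_codes` and `BPF_SAMPLE_TABLE` are hypothetical names used only for illustration.

```ruby
# Hypothetical standalone sketch mirroring addop_alu / addop_j above.
# 0x04 selects the ALU class with an immediate (:k) source, 0x0C with :x.
def alu_codes(name, bin)
  { (bin | 0x04) => [name, :a, :k],
    (bin | 0x0C) => [name, :a, :x] }
end

# 0x05 selects the jump class; 0x08 switches the comparand from :k to :x.
def jump_codes(name, bin)
  { (bin | 0x05)        => [name, :a, :k, :jt, :jf],
    (bin | 0x05 | 0x08) => [name, :a, :x, :jt, :jf] }
end

BPF_SAMPLE_TABLE = {}
  .merge(alu_codes('add', 0x00))
  .merge(alu_codes('and', 0x50))
  .merge(jump_codes('je', 0x10))
  .freeze
```

A table keyed by the full code value is what makes the single hash lookup in `build_bin_lookaside` sufficient for this architecture.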
-41
@@ -1,41 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/bpf/opcodes'
require 'metasm/render'
module Metasm
class BPF
class Reg
include Renderable
def render ; [@v.to_s] end
end
class MemRef
include Renderable
def render
r = []
r << memtype
r << [nil, ' byte ', ' word ', nil, ' dword '][@msz]
r << '['
r << @base if @base
r << '+' if @base and @offset
r << @offset if @offset
r << ']'
end
end
def render_instruction(i)
r = []
r << i.opname
if not i.args.empty?
r << ' '
i.args.each { |a_| r << a_ << ', ' }
r.pop
end
r
end
end
end
-253
@@ -1,253 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/cy16/opcodes'
require 'metasm/decode'
module Metasm
class CY16
def build_opcode_bin_mask(op)
# bit = 0 if it can be mutated by a field value, 1 if fixed by the opcode
op.bin_mask = 0
op.fields.each { |f, off|
op.bin_mask |= (@fields_mask[f] << off)
}
op.bin_mask ^= 0xffff
end
def build_bin_lookaside
# sets up a hash byte value => list of opcodes that may match
# opcode.bin_mask is built here
lookaside = Array.new(256) { [] }
opcode_list.each { |op|
build_opcode_bin_mask op
b = (op.bin >> 8) & 0xff
msk = (op.bin_mask >> 8) & 0xff
for i in b..(b | (255^msk))
lookaside[i] << op if i & msk == b & msk
end
}
lookaside
end
def decode_findopcode(edata)
di = DecodedInstruction.new self
return if edata.ptr+2 > edata.length
bin = edata.decode_imm(:u16, @endianness)
edata.ptr -= 2
return di if di.opcode = @bin_lookaside[(bin >> 8) & 0xff].find { |op|
bin & op.bin_mask == op.bin & op.bin_mask
}
end
def decode_instr_op_r(val, edata)
bw = ((val & 0b1000) > 0 ? 1 : 2)
case val & 0b11_0000
when 0b00_0000
Reg.new(val)
when 0b01_0000
if val == 0b01_1111
Expression[edata.decode_imm(:u16, @endianness)]
else
Memref.new(Reg.new(8+(val&7)), nil, bw)
end
when 0b10_0000
if val & 7 == 7
Memref.new(nil, edata.decode_imm(:u16, @endianness), bw)
else
Memref.new(Reg.new(8+(val&7)), nil, bw, true)
end
when 0b11_0000
Memref.new(Reg.new(8+(val&7)), edata.decode_imm(:u16, @endianness), bw)
end
end
def decode_instr_op(edata, di)
before_ptr = edata.ptr
op = di.opcode
di.instruction.opname = op.name
bin = edata.decode_imm(:u16, @endianness)
field_val = lambda { |f|
if off = op.fields[f]
(bin >> off) & @fields_mask[f]
end
}
op.args.each { |a|
di.instruction.args << case a
when :rs, :rd; decode_instr_op_r(field_val[a], edata)
when :o7; Expression[2*Expression.make_signed(field_val[a], 7)]
when :x7; Expression[field_val[a]]
when :u3; Expression[field_val[a]+1]
else raise SyntaxError, "Internal error: invalid argument #{a} in #{op.name}"
end
}
di.instruction.args.reverse!
di.bin_length += edata.ptr - before_ptr
di
rescue InvalidRD
end
def decode_instr_interpret(di, addr)
if di.opcode.props[:setip] and di.opcode.args.last == :o7
delta = di.instruction.args.last.reduce
arg = Expression[[addr, :+, di.bin_length], :+, delta].reduce
di.instruction.args[-1] = Expression[arg]
end
di
end
# hash opcode_name => lambda { |di, *symbolic_args| instr_binding }
def backtrace_binding
@backtrace_binding ||= init_backtrace_binding
end
def backtrace_binding=(b) @backtrace_binding = b end
# populate the @backtrace_binding hash with default values
def init_backtrace_binding
@backtrace_binding ||= {}
mask = 0xffff
opcode_list.map { |ol| ol.basename }.uniq.sort.each { |op|
binding = case op
when 'mov'; lambda { |di, a0, a1| { a0 => Expression[a1] } }
when 'add', 'adc', 'sub', 'sbb', 'and', 'xor', 'or', 'addi', 'subi'
lambda { |di, a0, a1|
e_op = { 'add' => :+, 'adc' => :+, 'sub' => :-, 'sbb' => :-, 'and' => :&,
'xor' => :^, 'or' => :|, 'addi' => :+, 'subi' => :- }[op]
ret = Expression[a0, e_op, a1]
ret = Expression[ret, e_op, :flag_c] if op == 'adc' or op == 'sbb'
# optimises r0 ^ r0 => 0
# avoid hiding memory accesses (to not hide possible fault)
ret = Expression[ret.reduce] if not a0.kind_of? Indirection
{ a0 => ret }
}
when 'cmp', 'test'; lambda { |di, *a| {} }
when 'not'; lambda { |di, a0| { a0 => Expression[a0, :^, mask] } }
when 'call'
lambda { |di, a0| { :sp => Expression[:sp, :-, 2],
Indirection[:sp, 2, di.address] => Expression[di.next_addr] }
}
when 'ret'; lambda { |di, *a| { :sp => Expression[:sp, :+, 2] } }
# TODO callCC, retCC ...
when /^j/; lambda { |di, *a| {} }
end
# TODO flags ?
@backtrace_binding[op] ||= binding if binding
}
@backtrace_binding
end
def get_backtrace_binding(di)
a = di.instruction.args.map { |arg|
case arg
when Memref, Reg; arg.symbolic(di)
else arg
end
}
if binding = backtrace_binding[di.opcode.basename]
bd = {}
di.instruction.args.each { |aa| bd[aa.base.symbolic] = Expression[aa.base.symbolic, :+, aa.sz] if aa.kind_of?(Memref) and aa.autoincr }
bd.update binding[di, *a]
else
puts "unhandled instruction to backtrace: #{di}" if $VERBOSE
# assume nothing except the 1st arg is modified
case a[0]
when Indirection, Symbol; { a[0] => Expression::Unknown }
when Expression; (x = a[0].externals.first) ? { x => Expression::Unknown } : {}
else {}
end.update(:incomplete_binding => Expression[1])
end
end
# patch a forward binding from the backtrace binding
def fix_fwdemu_binding(di, fbd)
case di.opcode.name
when 'call'; fbd[Indirection[[:sp, :-, 2], 2]] = fbd.delete(Indirection[:sp, 2])
end
fbd
end
def get_xrefs_x(dasm, di)
return [] if not di.opcode.props[:setip]
return [Indirection[:sp, 2, di.address]] if di.opcode.name =~ /^r/
case tg = di.instruction.args.first
when Memref; [Expression[tg.symbolic(di)]]
when Reg; [Expression[tg.symbolic(di)]]
when Expression, ::Integer; [Expression[tg]]
else
puts "unhandled setip at #{di.address} #{di.instruction}" if $DEBUG
[]
end
end
# checks if expr is a valid return expression matching the :saveip instruction
def backtrace_is_function_return(expr, di=nil)
expr = Expression[expr].reduce_rec
expr.kind_of?(Indirection) and expr.len == 2 and expr.target == Expression[:sp]
end
# updates the function backtrace_binding
# if the function is big and no specific register is given, do nothing (the binding will be lazily updated later, on demand)
def backtrace_update_function_binding(dasm, faddr, f, retaddrlist, *wantregs)
b = f.backtrace_binding
bt_val = lambda { |r|
next if not retaddrlist
b[r] = Expression::Unknown
bt = []
retaddrlist.each { |retaddr|
bt |= dasm.backtrace(Expression[r], retaddr, :include_start => true,
:snapshot_addr => faddr, :origin => retaddr)
}
if bt.length != 1
b[r] = Expression::Unknown
else
b[r] = bt.first
end
}
if not wantregs.empty?
wantregs.each(&bt_val)
else
bt_val[:sp]
end
b
end
# returns true if the expression is an address on the stack
def backtrace_is_stack_address(expr)
Expression[expr].expr_externals.include?(:sp)
end
# updates an instruction's argument replacing an expression with another (eg label renamed)
def replace_instr_arg_immediate(i, old, new)
i.args.map! { |a|
case a
when Expression; a == old ? new : Expression[a.bind(old => new).reduce]
when Memref
a.offset = (a.offset == old ? new : Expression[a.offset.bind(old => new).reduce]) if a.offset
a
else a
end
}
end
end
end
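Two pieces of the CY16 decoder above are easy to check by hand: the `bin_mask` computation in `build_opcode_bin_mask` and the `:o7` branch-offset decoding. The standalone sketch below reproduces both with plain integers. The helper names are hypothetical, the field tables are copied from `init_cy16`, and the sign handling assumes `Expression.make_signed` performs ordinary two's-complement sign extension.

```ruby
# Hypothetical standalone sketch; field tables copied from init_cy16.
CY16_FIELDS_MASK  = { rs: 0x3f, rd: 0x3f, o7: 0x7f }.freeze
CY16_FIELDS_SHIFT = { rs: 6, rd: 0, o7: 0 }.freeze

# Bits set in the result are fixed by the opcode; cleared bits belong to
# operand fields and are ignored when matching.
def cy16_bin_mask(fields)
  variable = fields.inject(0) { |m, f|
    m | (CY16_FIELDS_MASK[f] << CY16_FIELDS_SHIFT[f])
  }
  variable ^ 0xffff
end

# Sign-extend a 7-bit field (assumed two's complement), then scale by 2
# as the :o7 case of decode_instr_op does.
def cy16_o7_offset(val)
  val -= 0x80 if (val & 0x40) != 0
  2 * val
end
```

For a two-operand instruction such as `mov`, only the top four bits stay fixed, which is why the lookaside table indexed by the high byte can hold several candidates per slot.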
-63
@@ -1,63 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
module Metasm
class CY16 < CPU
class Reg
class << self
attr_accessor :s_to_i, :i_to_s
end
@i_to_s = (0..14).inject({}) { |h, i| h.update i => "r#{i}" }
@i_to_s[15] = 'sp'
@s_to_i = @i_to_s.invert
attr_accessor :i
def initialize(i)
@i = i
end
def symbolic(orig=nil) ; to_s.to_sym ; end
def self.from_str(s)
raise "Bad name #{s.inspect}" if not x = @s_to_i[s]
new(x)
end
end
class Memref
attr_accessor :base, :offset, :sz, :autoincr
def initialize(base, offset, sz=nil, autoincr=nil)
@base = base
offset = Expression[offset] if offset
@offset = offset
@sz = sz
@autoincr = autoincr
end
def symbolic(orig)
p = nil
p = Expression[p, :+, @base.symbolic] if base
p = Expression[p, :+, @offset] if offset
Indirection[p.reduce, @sz, orig]
end
end
def initialize(family = :latest)
super()
@endianness = :little
@size = 16
@family = family
end
def init_opcode_list
send("init_#@family")
@opcode_list
end
end
end
-78
@@ -1,78 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/cy16/main'
module Metasm
class CY16
def addop(name, bin, *args)
o = Opcode.new name, bin
args.each { |a|
o.args << a if @fields_mask[a] or @valid_args[a]
o.props[a] = true if @valid_props[a]
o.fields[a] = @fields_shift[a] if @fields_mask[a]
raise "wtf #{a.inspect}" unless @valid_args[a] or @valid_props[a] or @fields_mask[a]
}
@opcode_list << o
end
def addop_macrocc(name, bin, *args)
%w[z nz b ae s ns o no a be g ge l le].each_with_index { |cc, i|
dbin = bin
dbin |= i << 8
addop name + cc, dbin, *args
}
end
def init_cy16
@opcode_list = []
@valid_args.update [:rs, :rd, :o7
].inject({}) { |h, v| h.update v => true }
@fields_mask.update :rs => 0x3f, :rd => 0x3f, :o7 => 0x7f, :x7 => 0x7f, :u3 => 7
@fields_shift.update :rs => 6, :rd => 0, :o7 => 0, :x7 => 0, :u3 => 6
addop 'mov', 0<<12, :rs, :rd
addop 'add', 1<<12, :rs, :rd
addop 'adc', 2<<12, :rs, :rd
addop 'addc',2<<12, :rs, :rd
addop 'sub', 3<<12, :rs, :rd
addop 'sbb', 4<<12, :rs, :rd
addop 'subb',4<<12, :rs, :rd
addop 'cmp', 5<<12, :rs, :rd
addop 'and', 6<<12, :rs, :rd
addop 'test',7<<12, :rs, :rd
addop 'or', 8<<12, :rs, :rd
addop 'xor', 9<<12, :rs, :rd
addop_macrocc 'int', (10<<12), :x7
addop 'int', (10<<12) | (15<<8), :x7
addop_macrocc 'c', (10<<12) | (1<<7), :setip, :saveip, :rd
addop 'call',(10<<12) | (15<<8) | (1<<7), :setip, :stopexec, :saveip, :rd
addop_macrocc 'r', (12<<12) | (1<<7) | 0b010111, :setip # must come before absolute jmp
addop 'ret', (12<<12) | (15<<8) | (1<<7) | 0b010111, :setip, :stopexec
addop_macrocc 'j', (12<<12), :setip, :o7 # relative
addop 'jmp', (12<<12) | (15<<8), :setip, :stopexec, :o7 # relative
addop_macrocc 'j', (12<<12) | (1<<7), :setip, :rd # absolute
addop 'jmp', (12<<12) | (15<<8) | (1<<7), :setip, :stopexec, :rd # absolute
addop 'shr', (13<<12) | (0<<9), :u3, :rd
addop 'shl', (13<<12) | (1<<9), :u3, :rd
addop 'ror', (13<<12) | (2<<9), :u3, :rd
addop 'rol', (13<<12) | (3<<9), :u3, :rd
addop 'addi',(13<<12) | (4<<9), :u3, :rd
addop 'subi',(13<<12) | (5<<9), :u3, :rd
addop 'not', (13<<12) | (7<<9) | (0<<6), :rd
addop 'neg', (13<<12) | (7<<9) | (1<<6), :rd
addop 'cbw', (13<<12) | (7<<9) | (4<<6), :rd
addop 'sti', (13<<12) | (7<<9) | (7<<6) | 0
addop 'cli', (13<<12) | (7<<9) | (7<<6) | 1
addop 'stc', (13<<12) | (7<<9) | (7<<6) | 2
addop 'clc', (13<<12) | (7<<9) | (7<<6) | 3
end
alias init_latest init_cy16
end
end
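`addop_macrocc` above expands one mnemonic prefix into fourteen conditional variants by placing the condition index in bits 8 to 11. A minimal standalone sketch of that expansion, with the hypothetical names `CY16_CC`, `expand_cc` and `CY16_JCC`:

```ruby
# Hypothetical standalone sketch of the addop_macrocc expansion.
CY16_CC = %w[z nz b ae s ns o no a be g ge l le].freeze

# Returns a hash of mnemonic => opcode value, one entry per condition code.
def expand_cc(name, bin)
  CY16_CC.each_with_index.map { |cc, i| [name + cc, bin | (i << 8)] }.to_h
end

# Relative conditional jumps use 12 << 12 as their base value.
CY16_JCC = expand_cc('j', 12 << 12).freeze
```

Condition index 15 is not in the list; the opcode table assigns it separately to the unconditional forms (`jmp`, `call`, `ret`, `int`).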
-41
@@ -1,41 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/cy16/opcodes'
require 'metasm/render'
module Metasm
class CY16
class Reg
include Renderable
def render ; [self.class.i_to_s[@i]] end
end
class Memref
include Renderable
def render
r = []
r << (@sz == 1 ? 'byte ptr ' : 'word ptr ')
r << '['
r << @base if @base
r << '++' if @autoincr
r << ' + ' if @base and @offset
r << @offset if @offset
r << ']'
end
end
def render_instruction(i)
r = []
r << i.opname
if not i.args.empty?
r << ' '
i.args.each { |a_| r << a_ << ', ' }
r.pop
end
r
end
end
end
-17
@@ -1,17 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
# fix autorequire warning
class Metasm::Ia32 < Metasm::CPU
end
require 'metasm/main'
require 'metasm/cpu/ia32/parse'
require 'metasm/cpu/ia32/encode'
require 'metasm/cpu/ia32/decode'
require 'metasm/cpu/ia32/render'
require 'metasm/cpu/ia32/compile_c'
require 'metasm/cpu/ia32/decompile'
require 'metasm/cpu/ia32/debug'
-1424
@@ -1,1424 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/ia32/main'
module Metasm
class Ia32
def init_cpu_constants
@opcode_list ||= []
@fields_mask.update :w => 1, :s => 1, :d => 1, :modrm => 0xC7,
:reg => 7, :eeec => 7, :eeed => 7, :eeet => 7, :seg2 => 3, :seg3 => 7,
:regfp => 7, :regmmx => 7, :regxmm => 7, :regymm => 7,
:vex_r => 1, :vex_b => 1, :vex_x => 1, :vex_w => 1,
:vex_vvvv => 0xF
@fields_mask[:seg2A] = @fields_mask[:seg2]
@fields_mask[:seg3A] = @fields_mask[:seg3]
[:i, :i8, :u8, :u16, :reg, :seg2, :seg2A,
:seg3, :seg3A, :eeec, :eeed, :eeet, :modrm, :mrm_imm,
:farptr, :imm_val1, :imm_val3, :reg_cl, :reg_eax,
:reg_dx, :regfp, :regfp0, :modrmmmx, :regmmx,
:modrmxmm, :regxmm, :modrmymm, :regymm,
:vexvxmm, :vexvymm, :vexvreg, :i4xmm, :i4ymm
].each { |a| @valid_args[a] = true }
[:strop, :stropz, :opsz, :adsz, :argsz, :setip,
:stopexec, :saveip, :unsigned_imm, :random, :needpfx,
:xmmx, :modrmR, :modrmA, :mrmvex
].each { |a| @valid_props[a] = true }
end
# only most common instructions from the 386 instruction set
# inexhaustive list :
# no aaa, arpl, mov crX, call/jmp/ret far, in/out, bts, xchg...
def init_386_common_only
init_cpu_constants
addop_macro1 'adc', 2
addop_macro1 'add', 0
addop_macro1 'and', 4, :unsigned_imm
addop 'bswap', [0x0F, 0xC8], :reg
addop 'call', [0xE8], nil, :stopexec, :setip, :i, :saveip
addop 'call', [0xFF], 2, :stopexec, :setip, :saveip
addop('cbw', [0x98]) { |o| o.props[:opsz] = 16 }
addop('cwde', [0x98]) { |o| o.props[:opsz] = 32 }
addop('cwd', [0x99]) { |o| o.props[:opsz] = 16 }
addop('cdq', [0x99]) { |o| o.props[:opsz] = 32 }
addop_macro1 'cmp', 7
addop_macrostr 'cmps', [0xA6], :stropz
addop 'dec', [0x48], :reg
addop 'dec', [0xFE], 1, {:w => [0, 0]}
addop 'div', [0xF6], 6, {:w => [0, 0]}
addop 'enter', [0xC8], nil, :u16, :u8
addop 'idiv', [0xF6], 7, {:w => [0, 0]}
addop 'imul', [0xF6], 5, {:w => [0, 0]} # implicit eax, but different semantic from imul eax, ebx (the implicit version updates edx:eax)
addop 'imul', [0x0F, 0xAF], :mrm
addop 'imul', [0x69], :mrm, {:s => [0, 1]}, :i
addop 'inc', [0x40], :reg
addop 'inc', [0xFE], 0, {:w => [0, 0]}
addop 'int', [0xCC], nil, :imm_val3, :stopexec
addop 'int', [0xCD], nil, :u8
addop_macrotttn 'j', [0x70], nil, :setip, :i8
addop_macrotttn('j', [0x70], nil, :setip, :i8) { |o| o.name << '.i8' }
addop_macrotttn 'j', [0x0F, 0x80], nil, :setip, :i
addop_macrotttn('j', [0x0F, 0x80], nil, :setip, :i) { |o| o.name << '.i' }
addop 'jmp', [0xE9], nil, {:s => [0, 1]}, :setip, :i, :stopexec
addop 'jmp', [0xFF], 4, :setip, :stopexec
addop 'lea', [0x8D], :mrmA
addop 'leave', [0xC9]
addop_macrostr 'lods', [0xAC], :strop
addop 'loop', [0xE2], nil, :setip, :i8
addop 'loopz', [0xE1], nil, :setip, :i8
addop 'loope', [0xE1], nil, :setip, :i8
addop 'loopnz',[0xE0], nil, :setip, :i8
addop 'loopne',[0xE0], nil, :setip, :i8
addop 'mov', [0xA0], nil, {:w => [0, 0], :d => [0, 1]}, :reg_eax, :mrm_imm
addop('mov', [0x88], :mrmw,{:d => [0, 1]}) { |o| o.args.reverse! }
addop 'mov', [0xB0], :reg, {:w => [0, 3]}, :i, :unsigned_imm
addop 'mov', [0xC6], 0, {:w => [0, 0]}, :i, :unsigned_imm
addop_macrostr 'movs', [0xA4], :strop
addop 'movsx', [0x0F, 0xBE], :mrmw
addop 'movzx', [0x0F, 0xB6], :mrmw
addop 'mul', [0xF6], 4, {:w => [0, 0]}
addop 'neg', [0xF6], 3, {:w => [0, 0]}
addop 'nop', [0x90]
addop 'not', [0xF6], 2, {:w => [0, 0]}
addop_macro1 'or', 1, :unsigned_imm
addop 'pop', [0x58], :reg
addop 'pop', [0x8F], 0
addop 'push', [0x50], :reg
addop 'push', [0xFF], 6
addop 'push', [0x68], nil, {:s => [0, 1]}, :i, :unsigned_imm
addop 'ret', [0xC3], nil, :stopexec, :setip
addop 'ret', [0xC2], nil, :stopexec, :u16, :setip
addop_macro3 'rol', 0
addop_macro3 'ror', 1
addop_macro3 'sar', 7
addop_macro1 'sbb', 3
addop_macrostr 'scas', [0xAE], :stropz
addop_macrotttn('set', [0x0F, 0x90], 0) { |o| o.props[:argsz] = 8 }
addop_macrotttn('set', [0x0F, 0x90], :mrm) { |o| o.props[:argsz] = 8 ; o.args.reverse! } # :reg field is unused
addop_macro3 'shl', 4
addop_macro3 'sal', 6
addop 'shld', [0x0F, 0xA4], :mrm, :u8
addop 'shld', [0x0F, 0xA5], :mrm, :reg_cl
addop_macro3 'shr', 5
addop 'shrd', [0x0F, 0xAC], :mrm, :u8
addop 'shrd', [0x0F, 0xAD], :mrm, :reg_cl
addop_macrostr 'stos', [0xAA], :strop
addop_macro1 'sub', 5
addop 'test', [0x84], :mrmw
addop 'test', [0xA8], nil, {:w => [0, 0]}, :reg_eax, :i, :unsigned_imm
addop 'test', [0xF6], 0, {:w => [0, 0]}, :i, :unsigned_imm
addop 'xchg', [0x90], :reg, :reg_eax
addop('xchg', [0x90], :reg, :reg_eax) { |o| o.args.reverse! } # xchg eax, ebx == xchg ebx, eax
addop 'xchg', [0x86], :mrmw
addop('xchg', [0x86], :mrmw) { |o| o.args.reverse! }
addop_macro1 'xor', 6, :unsigned_imm
end
def init_386_only
init_cpu_constants
addop 'aaa', [0x37]
addop 'aad', [0xD5, 0x0A]
addop 'aam', [0xD4, 0x0A]
addop 'aas', [0x3F]
addop('arpl', [0x63], :mrm) { |o| o.props[:argsz] = 16 ; o.args.reverse! }
addop 'bound', [0x62], :mrmA
addop 'bsf', [0x0F, 0xBC], :mrm
addop 'bsr', [0x0F, 0xBD], :mrm
addop_macro2 'bt' , 0
addop_macro2 'btc', 3
addop_macro2 'btr', 2
addop_macro2 'bts', 1
addop 'call', [0x9A], nil, :stopexec, :setip, :farptr, :saveip
addop 'callf', [0x9A], nil, :stopexec, :setip, :farptr, :saveip
addop 'callf', [0xFF], 3, :stopexec, :setip, :saveip
addop 'clc', [0xF8]
addop 'cld', [0xFC]
addop 'cli', [0xFA]
addop 'clts', [0x0F, 0x06]
addop 'cmc', [0xF5]
addop('cmpxchg',[0x0F, 0xB0], :mrmw) { |o| o.args.reverse! }
addop 'cpuid', [0x0F, 0xA2]
addop 'daa', [0x27]
addop 'das', [0x2F]
addop 'hlt', [0xF4], nil, :stopexec
addop 'in', [0xE4], nil, {:w => [0, 0]}, :reg_eax, :u8
addop 'in', [0xE4], nil, {:w => [0, 0]}, :u8
addop 'in', [0xEC], nil, {:w => [0, 0]}, :reg_eax, :reg_dx
addop 'in', [0xEC], nil, {:w => [0, 0]}, :reg_eax
addop 'in', [0xEC], nil, {:w => [0, 0]}
addop_macrostr 'ins', [0x6C], :strop
addop 'into', [0xCE]
addop 'invd', [0x0F, 0x08]
addop 'invlpg', [0x0F, 0x01, 7<<3], :modrmA
addop('iretd', [0xCF], nil, :stopexec, :setip) { |o| o.props[:opsz] = 32 }
addop_macroret 'iret', [0xCF]
addop('jcxz', [0xE3], nil, :setip, :i8) { |o| o.props[:adsz] = 16 }
addop('jecxz', [0xE3], nil, :setip, :i8) { |o| o.props[:adsz] = 32 }
addop 'jmp', [0xEA], nil, :farptr, :setip, :stopexec
addop 'jmpf', [0xEA], nil, :farptr, :setip, :stopexec
addop 'jmpf', [0xFF], 5, :stopexec, :setip # reg ?
addop 'lahf', [0x9F]
addop 'lar', [0x0F, 0x02], :mrm
addop 'lds', [0xC5], :mrmA
addop 'les', [0xC4], :mrmA
addop 'lfs', [0x0F, 0xB4], :mrmA
addop 'lgs', [0x0F, 0xB5], :mrmA
addop 'lgdt', [0x0F, 0x01], 2, :modrmA
addop 'lidt', [0x0F, 0x01, 3<<3], :modrmA
addop 'lldt', [0x0F, 0x00], 2, :modrmA
addop 'lmsw', [0x0F, 0x01], 6
# prefix addop 'lock', [0xF0]
addop 'lsl', [0x0F, 0x03], :mrm
addop 'lss', [0x0F, 0xB2], :mrmA
addop 'ltr', [0x0F, 0x00], 3
addop 'mov', [0x0F, 0x20, 0xC0], :reg, {:d => [1, 1], :eeec => [2, 3]}, :eeec
addop 'mov', [0x0F, 0x21, 0xC0], :reg, {:d => [1, 1], :eeed => [2, 3]}, :eeed
addop 'mov', [0x0F, 0x24, 0xC0], :reg, {:d => [1, 1], :eeet => [2, 3]}, :eeet
addop 'mov', [0x8C], 0, {:d => [0, 1], :seg3 => [1, 3]}, :seg3
addop 'movbe', [0x0F, 0x38, 0xF0], :mrm, { :d => [2, 0] }
addop 'out', [0xE6], nil, {:w => [0, 0]}, :u8, :reg_eax
addop 'out', [0xE6], nil, {:w => [0, 0]}, :reg_eax, :u8
addop 'out', [0xE6], nil, {:w => [0, 0]}, :u8
addop 'out', [0xEE], nil, {:w => [0, 0]}, :reg_dx, :reg_eax
addop 'out', [0xEE], nil, {:w => [0, 0]}, :reg_eax, :reg_dx
addop 'out', [0xEE], nil, {:w => [0, 0]}, :reg_eax # implicit arguments
addop 'out', [0xEE], nil, {:w => [0, 0]}
addop_macrostr 'outs', [0x6E], :strop
addop 'pop', [0x07], nil, {:seg2A => [0, 3]}, :seg2A
addop 'pop', [0x0F, 0x81], nil, {:seg3A => [1, 3]}, :seg3A
addop('popa', [0x61]) { |o| o.props[:opsz] = 16 }
addop('popad', [0x61]) { |o| o.props[:opsz] = 32 }
addop('popf', [0x9D]) { |o| o.props[:opsz] = 16 }
addop('popfd', [0x9D]) { |o| o.props[:opsz] = 32 }
addop 'push', [0x06], nil, {:seg2 => [0, 3]}, :seg2
addop 'push', [0x0F, 0x80], nil, {:seg3A => [1, 3]}, :seg3A
addop('pusha', [0x60]) { |o| o.props[:opsz] = 16 }
addop('pushad',[0x60]) { |o| o.props[:opsz] = 32 }
addop('pushf', [0x9C]) { |o| o.props[:opsz] = 16 }
addop('pushfd',[0x9C]) { |o| o.props[:opsz] = 32 }
addop_macro3 'rcl', 2
addop_macro3 'rcr', 3
addop 'rdmsr', [0x0F, 0x32]
addop 'rdpmc', [0x0F, 0x33]
addop 'rdtsc', [0x0F, 0x31], nil, :random
addop_macroret 'retf', [0xCB]
addop_macroret 'retf', [0xCA], :u16
addop 'rsm', [0x0F, 0xAA], nil, :stopexec
addop 'sahf', [0x9E]
addop 'sgdt', [0x0F, 0x01, 0<<3], :modrmA
addop 'sidt', [0x0F, 0x01, 1<<3], :modrmA
addop 'sldt', [0x0F, 0x00], 0
addop 'smsw', [0x0F, 0x01], 4
addop 'stc', [0xF9]
addop 'std', [0xFD]
addop 'sti', [0xFB]
addop 'str', [0x0F, 0x00], 1
addop 'test', [0xF6], 1, {:w => [0, 0]}, :i, :unsigned_imm # undocumented alias to F6/0
addop 'ud2', [0x0F, 0x0B]
addop 'verr', [0x0F, 0x00], 4
addop 'verw', [0x0F, 0x00], 5
addop 'wait', [0x9B]
addop 'wbinvd',[0x0F, 0x09]
addop 'wrmsr', [0x0F, 0x30]
addop('xadd', [0x0F, 0xC0], :mrmw) { |o| o.args.reverse! }
addop 'xlat', [0xD7]
# pfx: addrsz = 0x67, lock = 0xF0, opsz = 0x66, repnz = 0xF2, rep/repz = 0xF3
# cs/nojmp = 0x2E, ds/jmp = 0x3E, es = 0x26, fs = 0x64, gs = 0x65, ss = 0x36
# undocumented opcodes
addop 'aam', [0xD4], nil, :u8
addop 'aad', [0xD5], nil, :u8
addop 'setalc',[0xD6]
addop 'salc', [0xD6]
addop 'icebp', [0xF1]
#addop 'loadall',[0x0F, 0x07] # conflict with syscall
addop 'ud0', [0x0F, 0xFF] # amd
addop 'ud2', [0x0F, 0xB9], :mrm
#addop 'umov', [0x0F, 0x10], :mrmw, {:d => [1, 1]} # conflicts with movups/movhlps
end
def init_387_only
init_cpu_constants
addop 'f2xm1', [0xD9, 0xF0]
addop 'fabs', [0xD9, 0xE1]
addop_macrofpu1 'fadd', 0
addop 'faddp', [0xDE, 0xC0], :regfp
addop 'faddp', [0xDE, 0xC1]
addop('fbld', [0xDF, 4<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 80 }
addop('fbstp', [0xDF, 6<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 80 }
addop 'fchs', [0xD9, 0xE0], nil, :regfp0
addop 'fnclex', [0xDB, 0xE2]
addop_macrofpu1 'fcom', 2
addop_macrofpu1 'fcomp', 3
addop 'fcompp',[0xDE, 0xD9]
addop 'fcomip',[0xDF, 0xF0], :regfp
addop 'fcos', [0xD9, 0xFF], nil, :regfp0
addop 'fdecstp', [0xD9, 0xF6]
addop_macrofpu1 'fdiv', 6
addop_macrofpu1 'fdivr', 7
addop 'fdivp', [0xDE, 0xF8], :regfp
addop 'fdivp', [0xDE, 0xF9]
addop 'fdivrp',[0xDE, 0xF0], :regfp
addop 'fdivrp',[0xDE, 0xF1]
addop 'ffree', [0xDD, 0xC0], nil, {:regfp => [1, 0]}, :regfp
addop_macrofpu2 'fiadd', 0
addop_macrofpu2 'fimul', 1
addop_macrofpu2 'ficom', 2
addop_macrofpu2 'ficomp',3
addop_macrofpu2 'fisub', 4
addop_macrofpu2 'fisubr',5
addop_macrofpu2 'fidiv', 6
addop_macrofpu2 'fidivr',7
addop 'fincstp', [0xD9, 0xF7]
addop 'fninit', [0xDB, 0xE3]
addop_macrofpu2 'fist', 2, 1
addop_macrofpu3 'fild', 0
addop_macrofpu3 'fistp',3
addop('fld', [0xD9, 0<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 32 }
addop('fld', [0xDD, 0<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 64 }
addop('fld', [0xDB, 5<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 80 }
addop 'fld', [0xD9, 0xC0], :regfp
addop('fldcw', [0xD9, 5<<3], :modrmA) { |o| o.props[:argsz] = 16 }
addop 'fldenv', [0xD9, 4<<3], :modrmA
addop 'fld1', [0xD9, 0xE8]
addop 'fldl2t', [0xD9, 0xE9]
addop 'fldl2e', [0xD9, 0xEA]
addop 'fldpi', [0xD9, 0xEB]
addop 'fldlg2', [0xD9, 0xEC]
addop 'fldln2', [0xD9, 0xED]
addop 'fldz', [0xD9, 0xEE]
addop_macrofpu1 'fmul', 1
addop 'fmulp', [0xDE, 0xC8], :regfp
addop 'fmulp', [0xDE, 0xC9]
addop 'fnop', [0xD9, 0xD0]
addop 'fpatan', [0xD9, 0xF3]
addop 'fprem', [0xD9, 0xF8]
addop 'fprem1', [0xD9, 0xF5]
addop 'fptan', [0xD9, 0xF2]
addop 'frndint',[0xD9, 0xFC]
addop 'frstor', [0xDD, 4<<3], :modrmA
addop 'fnsave', [0xDD, 6<<3], :modrmA
addop('fnstcw', [0xD9, 7<<3], :modrmA) { |o| o.props[:argsz] = 16 }
addop 'fnstenv',[0xD9, 6<<3], :modrmA
addop 'fnstsw', [0xDF, 0xE0]
addop('fnstsw', [0xDD, 7<<3], :modrmA) { |o| o.props[:argsz] = 16 }
addop 'fscale', [0xD9, 0xFD]
addop 'fsin', [0xD9, 0xFE]
addop 'fsincos',[0xD9, 0xFB]
addop 'fsqrt', [0xD9, 0xFA]
addop('fst', [0xD9, 2<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 32 }
addop('fst', [0xDD, 2<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 64 }
addop 'fst', [0xDD, 0xD0], :regfp
addop('fstp', [0xD9, 3<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 32 }
addop('fstp', [0xDD, 3<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 64 }
addop('fstp', [0xDB, 7<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 80 }
addop 'fstp', [0xDD, 0xD8], :regfp
addop_macrofpu1 'fsub', 4
addop 'fsubp', [0xDE, 0xE8], :regfp
addop 'fsubp', [0xDE, 0xE9]
addop_macrofpu1 'fsubr', 5
addop 'fsubrp', [0xDE, 0xE0], :regfp
addop 'fsubrp', [0xDE, 0xE1]
addop 'ftst', [0xD9, 0xE4]
addop 'fucom', [0xDD, 0xE0], :regfp
addop 'fucomp', [0xDD, 0xE8], :regfp
addop 'fucompp',[0xDA, 0xE9]
addop 'fucomi', [0xDB, 0xE8], :regfp
addop 'fxam', [0xD9, 0xE5]
addop 'fxch', [0xD9, 0xC8], :regfp
addop 'fxtract',[0xD9, 0xF4]
addop 'fyl2x', [0xD9, 0xF1]
addop 'fyl2xp1',[0xD9, 0xF9]
# fwait prefix
addop 'fclex', [0x9B, 0xDB, 0xE2]
addop 'finit', [0x9B, 0xDB, 0xE3]
addop 'fsave', [0x9B, 0xDD, 6<<3], :modrmA
addop('fstcw', [0x9B, 0xD9, 7<<3], :modrmA) { |o| o.props[:argsz] = 16 }
addop 'fstenv', [0x9B, 0xD9, 6<<3], :modrmA
addop 'fstsw', [0x9B, 0xDF, 0xE0]
addop('fstsw', [0x9B, 0xDD, 7<<3], :modrmA) { |o| o.props[:argsz] = 16 }
addop 'fwait', [0x9B]
end
def init_486_only
init_cpu_constants
end
def init_pentium_only
init_cpu_constants
addop('cmpxchg8b', [0x0F, 0xC7], 1) { |o| o.props[:opsz] = 32 ; o.props[:argsz] = 64 }
# lock cmpxchg8b eax
#addop 'f00fbug', [0xF0, 0x0F, 0xC7, 0xC8]
# mmx
addop 'emms', [0x0F, 0x77]
addop('movd', [0x0F, 0x6E], :mrmmmx, {:d => [1, 4]}) { |o| o.args = [:modrm, :regmmx] ; o.props[:opsz] = o.props[:argsz] = 32 }
addop('movq', [0x0F, 0x6F], :mrmmmx, {:d => [1, 4]}) { |o| o.props[:argsz] = 64 }
addop 'packssdw', [0x0F, 0x6B], :mrmmmx
addop 'packsswb', [0x0F, 0x63], :mrmmmx
addop 'packuswb', [0x0F, 0x67], :mrmmmx
addop_macrogg 0..2, 'padd', [0x0F, 0xFC], :mrmmmx
addop_macrogg 0..1, 'padds', [0x0F, 0xEC], :mrmmmx
addop_macrogg 0..1, 'paddus',[0x0F, 0xDC], :mrmmmx
addop 'pand', [0x0F, 0xDB], :mrmmmx
addop 'pandn', [0x0F, 0xDF], :mrmmmx
addop_macrogg 0..2, 'pcmpeq',[0x0F, 0x74], :mrmmmx
addop_macrogg 0..2, 'pcmpgt',[0x0F, 0x64], :mrmmmx
addop 'pmaddwd', [0x0F, 0xF5], :mrmmmx
addop 'pmulhuw', [0x0F, 0xE4], :mrmmmx
addop 'pmulhw',[0x0F, 0xE5], :mrmmmx
addop 'pmullw',[0x0F, 0xD5], :mrmmmx
addop 'por', [0x0F, 0xEB], :mrmmmx
[[1..3, 'psll', 3], [1..2, 'psra', 2], [1..3, 'psrl', 1]].each { |ggrng, name, val|
addop_macrogg ggrng, name, [0x0F, 0xC0 | (val << 4)], :mrmmmx
addop_macrogg ggrng, name, [0x0F, 0x70, 0xC0 | (val << 4)], nil, {:regmmx => [2, 0]}, :regmmx, :u8
}
addop_macrogg 0..2, 'psub', [0x0F, 0xF8], :mrmmmx
addop_macrogg 0..1, 'psubs', [0x0F, 0xE8], :mrmmmx
addop_macrogg 0..1, 'psubus',[0x0F, 0xD8], :mrmmmx
addop_macrogg 1..3, 'punpckh', [0x0F, 0x68], :mrmmmx
addop_macrogg 1..3, 'punpckl', [0x0F, 0x60], :mrmmmx
addop 'pxor', [0x0F, 0xEF], :mrmmmx
end
def init_p6_only
addop_macrotttn 'cmov', [0x0F, 0x40], :mrm
%w{b e be u}.each_with_index { |tt, i|
addop 'fcmov' + tt, [0xDA, 0xC0 | (i << 3)], :regfp
addop 'fcmovn'+ tt, [0xDB, 0xC0 | (i << 3)], :regfp
}
addop 'fcomi', [0xDB, 0xF0], :regfp
addop('fxrstor', [0x0F, 0xAE, 1<<3], :modrmA) { |o| o.props[:argsz] = 512*8 }
addop('fxsave', [0x0F, 0xAE, 0<<3], :modrmA) { |o| o.props[:argsz] = 512*8 }
addop 'sysenter',[0x0F, 0x34]
addop 'sysexit', [0x0F, 0x35]
addop 'syscall', [0x0F, 0x05] # AMD
addop_macroret 'sysret', [0x0F, 0x07] # AMD
end
def init_3dnow_only
init_cpu_constants
[['pavgusb', 0xBF], ['pfadd', 0x9E], ['pfsub', 0x9A],
['pfsubr', 0xAA], ['pfacc', 0xAE], ['pfcmpge', 0x90],
['pfcmpgt', 0xA0], ['pfcmpeq', 0xB0], ['pfmin', 0x94],
['pfmax', 0xA4], ['pi2fd', 0x0D], ['pf2id', 0x1D],
['pfrcp', 0x96], ['pfrsqrt', 0x97], ['pfmul', 0xB4],
['pfrcpit1', 0xA6], ['pfrsqit1', 0xA7], ['pfrcpit2', 0xB6],
['pmulhrw', 0xB7]].each { |str, bin|
addop str, [0x0F, 0x0F, bin], :mrmmmx
}
# 3dnow prefix fallback
addop '3dnow', [0x0F, 0x0F], :mrmmmx, :u8
addop 'femms', [0x0F, 0x0E]
addop 'prefetch', [0x0F, 0x0D, 0<<3], :modrmA
addop 'prefetchw', [0x0F, 0x0D, 1<<3], :modrmA
end
def init_sse_only
init_cpu_constants
addop_macrossps 'addps', [0x0F, 0x58], :mrmxmm
addop 'andnps', [0x0F, 0x55], :mrmxmm
addop 'andps', [0x0F, 0x54], :mrmxmm
addop_macrossps 'cmpps', [0x0F, 0xC2], :mrmxmm, :u8
addop 'comiss', [0x0F, 0x2F], :mrmxmm
addop('cvtpi2ps', [0x0F, 0x2A], :mrmxmm) { |o| o.args[o.args.index(:modrmxmm)] = :modrmmmx }
addop('cvtps2pi', [0x0F, 0x2D], :mrmmmx) { |o| o.args[o.args.index(:modrmmmx)] = :modrmxmm }
addop('cvtsi2ss', [0x0F, 0x2A], :mrmxmm) { |o| o.args[o.args.index(:modrmxmm)] = :modrm ; o.props[:needpfx] = 0xF3 }
addop('cvtss2si', [0x0F, 0x2D], :mrm) { |o| o.args[o.args.index(:modrm)] = :modrmxmm ; o.props[:needpfx] = 0xF3 }
addop('cvttps2pi',[0x0F, 0x2C], :mrmmmx) { |o| o.args[o.args.index(:modrmmmx)] = :modrmxmm }
addop('cvttss2si',[0x0F, 0x2C], :mrm) { |o| o.args[o.args.index(:modrm)] = :modrmxmm ; o.props[:needpfx] = 0xF3 }
addop_macrossps 'divps', [0x0F, 0x5E], :mrmxmm
addop 'ldmxcsr', [0x0F, 0xAE, 2<<3], :modrmA
addop_macrossps 'maxps', [0x0F, 0x5F], :mrmxmm
addop_macrossps 'minps', [0x0F, 0x5D], :mrmxmm
addop 'movaps', [0x0F, 0x28], :mrmxmm, {:d => [1, 0]}
addop 'movhlps', [0x0F, 0x12], :mrmxmm, :modrmR
addop 'movlps', [0x0F, 0x12], :mrmxmm, {:d => [1, 0]}, :modrmA
addop 'movlhps', [0x0F, 0x16], :mrmxmm, :modrmR
addop 'movhps', [0x0F, 0x16], :mrmxmm, {:d => [1, 0]}, :modrmA
addop 'movmskps',[0x0F, 0x50, 0xC0], nil, {:reg => [2, 3], :regxmm => [2, 0]}, :regxmm, :reg
addop('movss', [0x0F, 0x10], :mrmxmm, {:d => [1, 0]}) { |o| o.props[:needpfx] = 0xF3 }
addop 'movups', [0x0F, 0x10], :mrmxmm, {:d => [1, 0]}
addop_macrossps 'mulps', [0x0F, 0x59], :mrmxmm
addop 'orps', [0x0F, 0x56], :mrmxmm
addop_macrossps 'rcpps', [0x0F, 0x53], :mrmxmm
addop_macrossps 'rsqrtps',[0x0F, 0x52], :mrmxmm
addop 'shufps', [0x0F, 0xC6], :mrmxmm, :u8
addop_macrossps 'sqrtps', [0x0F, 0x51], :mrmxmm
addop 'stmxcsr', [0x0F, 0xAE, 3<<3], :modrmA
addop_macrossps 'subps', [0x0F, 0x5C], :mrmxmm
addop 'ucomiss', [0x0F, 0x2E], :mrmxmm
addop 'unpckhps',[0x0F, 0x15], :mrmxmm
addop 'unpcklps',[0x0F, 0x14], :mrmxmm
addop 'xorps', [0x0F, 0x57], :mrmxmm
# integer instrs, mmx only
addop 'pavgb', [0x0F, 0xE0], :mrmmmx
addop 'pavgw', [0x0F, 0xE3], :mrmmmx
addop 'pextrw', [0x0F, 0xC5, 0xC0], nil, {:reg => [2, 3], :regmmx => [2, 0]}, :reg, :regmmx, :u8
addop 'pinsrw', [0x0F, 0xC4, 0x00], nil, {:modrm => [2, 0], :regmmx => [2, 3]}, :modrm, :regmmx, :u8
addop 'pmaxsw', [0x0F, 0xEE], :mrmmmx
addop 'pmaxub', [0x0F, 0xDE], :mrmmmx
addop 'pminsw', [0x0F, 0xEA], :mrmmmx
addop 'pminub', [0x0F, 0xDA], :mrmmmx
addop 'pmovmskb',[0x0F, 0xD7, 0xC0], nil, {:reg => [2, 3], :regmmx => [2, 0]}, :reg, :regmmx
addop 'psadbw', [0x0F, 0xF6], :mrmmmx
addop 'pshufw', [0x0F, 0x70], :mrmmmx, :u8
addop 'maskmovq',[0x0F, 0xF7], :mrmmmx, :modrmR
addop('movntq', [0x0F, 0xE7], :mrmmmx) { |o| o.args.reverse! }
addop('movntps', [0x0F, 0x2B], :mrmxmm) { |o| o.args.reverse! }
addop 'prefetcht0', [0x0F, 0x18, 1<<3], :modrmA
addop 'prefetcht1', [0x0F, 0x18, 2<<3], :modrmA
addop 'prefetcht2', [0x0F, 0x18, 3<<3], :modrmA
addop 'prefetchnta',[0x0F, 0x18, 0<<3], :modrmA
addop 'sfence', [0x0F, 0xAE, 0xF8]
# the whole row of prefetch is actually nops
addop 'nop', [0x0F, 0x1C], :mrmw, :d => [1, 1] # incl. official version = 0f1f mrm
addop 'nop_8', [0x0F, 0x18], :mrmw, :d => [1, 1]
addop 'nop_d', [0x0F, 0x0D], :mrm
addop 'nop', [0x0F, 0x1C], 0 # official asm syntax is 'nop [eax]'
end
def init_sse2_only
init_cpu_constants
@opcode_list.each { |o| o.props[:xmmx] = true if o.fields[:regmmx] and o.name !~ /^(?:mov(?:nt)?q|pshufw|cvt.*)$/ }
# mirror of the init_sse part
addop_macrosdpd 'addpd', [0x0F, 0x58], :mrmxmm
addop('andnpd', [0x0F, 0x55], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('andpd', [0x0F, 0x54], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop_macrosdpd 'cmppd', [0x0F, 0xC2], :mrmxmm, :u8
addop('comisd', [0x0F, 0x2F], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('cvtpi2pd', [0x0F, 0x2A], :mrmxmm) { |o| o.args[o.args.index(:modrmxmm)] = :modrmmmx ; o.props[:needpfx] = 0x66 }
addop('cvtpd2pi', [0x0F, 0x2D], :mrmmmx) { |o| o.args[o.args.index(:modrmmmx)] = :modrmxmm ; o.props[:needpfx] = 0x66 }
addop('cvtsi2sd', [0x0F, 0x2A], :mrmxmm) { |o| o.args[o.args.index(:modrmxmm)] = :modrm ; o.props[:needpfx] = 0xF2 }
addop('cvtsd2si', [0x0F, 0x2D], :mrm ) { |o| o.args[o.args.index(:modrm )] = :modrmxmm ; o.props[:needpfx] = 0xF2 }
addop('cvttpd2pi',[0x0F, 0x2C], :mrmmmx) { |o| o.args[o.args.index(:modrmmmx)] = :modrmxmm ; o.props[:needpfx] = 0x66 }
addop('cvttsd2si',[0x0F, 0x2C], :mrm ) { |o| o.args[o.args.index(:modrm )] = :modrmxmm ; o.props[:needpfx] = 0xF2 }
addop('cvtpd2ps', [0x0F, 0x5A], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('cvtps2pd', [0x0F, 0x5A], :mrmxmm)
addop('cvtsd2ss', [0x0F, 0x5A], :mrmxmm) { |o| o.props[:needpfx] = 0xF2 }
addop('cvtss2sd', [0x0F, 0x5A], :mrmxmm) { |o| o.props[:needpfx] = 0xF3 }
addop('cvtpd2dq', [0x0F, 0xE6], :mrmxmm) { |o| o.props[:needpfx] = 0xF2 }
addop('cvttpd2dq',[0x0F, 0xE6], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('cvtdq2pd', [0x0F, 0xE6], :mrmxmm) { |o| o.props[:needpfx] = 0xF3 }
addop('cvtps2dq', [0x0F, 0x5B], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('cvttps2dq',[0x0F, 0x5B], :mrmxmm) { |o| o.props[:needpfx] = 0xF3 }
addop('cvtdq2ps', [0x0F, 0x5B], :mrmxmm)
addop_macrosdpd 'divpd', [0x0F, 0x5E], :mrmxmm
addop_macrosdpd 'maxpd', [0x0F, 0x5F], :mrmxmm
addop_macrosdpd 'minpd', [0x0F, 0x5D], :mrmxmm
addop('movapd', [0x0F, 0x28], :mrmxmm, {:d => [1, 0]}) { |o| o.props[:needpfx] = 0x66 }
addop('movlpd', [0x0F, 0x12], :mrmxmm, {:d => [1, 0]}) { |o| o.props[:needpfx] = 0x66 }
addop('movhpd', [0x0F, 0x16], :mrmxmm, {:d => [1, 0]}) { |o| o.props[:needpfx] = 0x66 }
addop('movmskpd',[0x0F, 0x50, 0xC0], nil, {:reg => [2, 3], :regxmm => [2, 0]}, :regxmm, :reg) { |o| o.props[:needpfx] = 0x66 }
addop('movsd', [0x0F, 0x10], :mrmxmm, {:d => [1, 0]}) { |o| o.props[:needpfx] = 0xF2 }
addop('movupd', [0x0F, 0x10], :mrmxmm, {:d => [1, 0]}) { |o| o.props[:needpfx] = 0x66 }
addop_macrosdpd 'mulpd', [0x0F, 0x59], :mrmxmm
addop('orpd', [0x0F, 0x56], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('shufpd', [0x0F, 0xC6], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop_macrosdpd 'sqrtpd', [0x0F, 0x51], :mrmxmm
addop_macrosdpd 'subpd', [0x0F, 0x5C], :mrmxmm
addop('ucomisd', [0x0F, 0x2E], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('unpckhpd',[0x0F, 0x15], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('unpcklpd',[0x0F, 0x14], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('xorpd', [0x0F, 0x57], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('movdqa', [0x0F, 0x6F], :mrmxmm, {:d => [1, 4]}) { |o| o.props[:needpfx] = 0x66 }
addop('movdqu', [0x0F, 0x6F], :mrmxmm, {:d => [1, 4]}) { |o| o.props[:needpfx] = 0xF3 }
addop('movq2dq', [0x0F, 0xD6], :mrmxmm, :modrmR) { |o| o.args[o.args.index(:modrmxmm)] = :modrmmmx ; o.props[:needpfx] = 0xF3 }
addop('movdq2q', [0x0F, 0xD6], :mrmmmx, :modrmR) { |o| o.args[o.args.index(:modrmmmx)] = :modrmxmm ; o.props[:needpfx] = 0xF2 }
addop('movq', [0x0F, 0x7E], :mrmxmm) { |o| o.props[:needpfx] = 0xF3 ; o.props[:argsz] = 128 }
addop('movq', [0x0F, 0xD6], :mrmxmm) { |o| o.args.reverse! ; o.props[:needpfx] = 0x66 ; o.props[:argsz] = 128 }
addop 'paddq', [0x0F, 0xD4], :mrmmmx, :xmmx
addop 'pmuludq', [0x0F, 0xF4], :mrmmmx, :xmmx
addop('pshuflw', [0x0F, 0x70], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0xF2 }
addop('pshufhw', [0x0F, 0x70], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0xF3 }
addop('pshufd', [0x0F, 0x70], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('pslldq', [0x0F, 0x73, 0xF8], nil, {:regxmm => [2, 0]}, :regxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('psrldq', [0x0F, 0x73, 0xD8], nil, {:regxmm => [2, 0]}, :regxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop 'psubq', [0x0F, 0xFB], :mrmmmx, :xmmx
addop('punpckhqdq', [0x0F, 0x6D], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('punpcklqdq', [0x0F, 0x6C], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('clflush', [0x0F, 0xAE, 7<<3], :modrmA) { |o| o.props[:argsz] = 8 }
addop('maskmovdqu', [0x0F, 0xF7], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('movntpd', [0x0F, 0x2B], :mrmxmm) { |o| o.args.reverse! ; o.props[:needpfx] = 0x66 }
addop('movntdq', [0x0F, 0xE7], :mrmxmm) { |o| o.args.reverse! ; o.props[:needpfx] = 0x66 }
addop('movnti', [0x0F, 0xC3], :mrm) { |o| o.args.reverse! }
addop('pause', [0x90]) { |o| o.props[:needpfx] = 0xF3 }
addop 'lfence', [0x0F, 0xAE, 0xE8]
addop 'mfence', [0x0F, 0xAE, 0xF0]
end
def init_sse3_only
init_cpu_constants
addop('addsubpd', [0x0F, 0xD0], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('addsubps', [0x0F, 0xD0], :mrmxmm) { |o| o.props[:needpfx] = 0xF2 }
addop('haddpd', [0x0F, 0x7C], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('haddps', [0x0F, 0x7C], :mrmxmm) { |o| o.props[:needpfx] = 0xF2 }
addop('hsubpd', [0x0F, 0x7D], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('hsubps', [0x0F, 0x7D], :mrmxmm) { |o| o.props[:needpfx] = 0xF2 }
addop 'monitor', [0x0F, 0x01, 0xC8]
addop 'mwait', [0x0F, 0x01, 0xC9]
addop('fisttp', [0xDF, 1<<3], :modrmA) { |o| o.props[:argsz] = 16 }
addop('fisttp', [0xDB, 1<<3], :modrmA) { |o| o.props[:argsz] = 32 }
addop('fisttp', [0xDD, 1<<3], :modrmA) { |o| o.props[:argsz] = 64 }
addop('lddqu', [0x0F, 0xF0], :mrmxmm, :modrmA) { |o| o.args[o.args.index(:modrmxmm)] = :modrm ; o.props[:needpfx] = 0xF2 }
addop('movddup', [0x0F, 0x12], :mrmxmm) { |o| o.props[:needpfx] = 0xF2 }
addop('movshdup', [0x0F, 0x16], :mrmxmm) { |o| o.props[:needpfx] = 0xF3 }
addop('movsldup', [0x0F, 0x12], :mrmxmm) { |o| o.props[:needpfx] = 0xF3 }
end
def init_ssse3_only
init_cpu_constants
addop_macrogg 0..2, 'pabs', [0x0F, 0x38, 0x1C], :mrmmmx, :xmmx
addop 'palignr', [0x0F, 0x3A, 0x0F], :mrmmmx, :u8, :xmmx
addop 'phaddd', [0x0F, 0x38, 0x02], :mrmmmx, :xmmx
addop 'phaddsw', [0x0F, 0x38, 0x03], :mrmmmx, :xmmx
addop 'phaddw', [0x0F, 0x38, 0x01], :mrmmmx, :xmmx
addop 'phsubd', [0x0F, 0x38, 0x06], :mrmmmx, :xmmx
addop 'phsubsw', [0x0F, 0x38, 0x07], :mrmmmx, :xmmx
addop 'phsubw', [0x0F, 0x38, 0x05], :mrmmmx, :xmmx
addop 'pmaddubsw',[0x0F, 0x38, 0x04], :mrmmmx, :xmmx
addop 'pmulhrsw', [0x0F, 0x38, 0x0B], :mrmmmx, :xmmx
addop 'pshufb', [0x0F, 0x38, 0x00], :mrmmmx, :xmmx
addop_macrogg 0..2, 'psign', [0x0F, 0x38, 0x08], :mrmmmx, :xmmx
end
def init_aesni_only
init_cpu_constants
addop('aesdec', [0x0F, 0x38, 0xDE], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('aesdeclast',[0x0F, 0x38, 0xDF], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('aesenc', [0x0F, 0x38, 0xDC], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('aesenclast',[0x0F, 0x38, 0xDD], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('aesimc', [0x0F, 0x38, 0xDB], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('aeskeygenassist', [0x0F, 0x3A, 0xDF], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('pclmulqdq', [0x0F, 0x3A, 0x44], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
end
def init_vmx_only
init_cpu_constants
addop 'vmcall', [0x0F, 0x01, 0xC1]
addop 'vmlaunch', [0x0F, 0x01, 0xC2]
addop 'vmresume', [0x0F, 0x01, 0xC3]
addop 'vmxoff', [0x0F, 0x01, 0xC4]
addop 'vmread', [0x0F, 0x78], :mrm
addop 'vmwrite', [0x0F, 0x79], :mrm
addop('vmclear', [0x0F, 0xC7, 6<<3], :modrmA) { |o| o.props[:argsz] = 64 ; o.props[:needpfx] = 0x66 }
addop('vmxon', [0x0F, 0xC7, 6<<3], :modrmA) { |o| o.props[:argsz] = 64 ; o.props[:needpfx] = 0xF3 }
addop('vmptrld', [0x0F, 0xC7, 6<<3], :modrmA) { |o| o.props[:argsz] = 64 }
addop('vmptrst', [0x0F, 0xC7, 7<<3], :modrmA) { |o| o.props[:argsz] = 64 }
addop('invept', [0x0F, 0x38, 0x80], :mrmA) { |o| o.props[:needpfx] = 0x66 }
addop('invvpid', [0x0F, 0x38, 0x81], :mrmA) { |o| o.props[:needpfx] = 0x66 }
addop 'getsec', [0x0F, 0x37]
addop 'xgetbv', [0x0F, 0x01, 0xD0]
addop 'xsetbv', [0x0F, 0x01, 0xD1]
addop 'rdtscp', [0x0F, 0x01, 0xF9]
addop 'xrstor', [0x0F, 0xAE, 5<<3], :modrmA
addop 'xsave', [0x0F, 0xAE, 4<<3], :modrmA
end
def init_sse41_only
init_cpu_constants
addop('blendpd', [0x0F, 0x3A, 0x0D], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('blendps', [0x0F, 0x3A, 0x0C], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('blendvpd', [0x0F, 0x38, 0x15], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('blendvps', [0x0F, 0x38, 0x14], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('dppd', [0x0F, 0x3A, 0x41], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('dpps', [0x0F, 0x3A, 0x40], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('extractps',[0x0F, 0x3A, 0x17], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('insertps', [0x0F, 0x3A, 0x21], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('movntdqa', [0x0F, 0x38, 0x2A], :mrmxmm, :modrmA) { |o| o.props[:needpfx] = 0x66 }
addop('mpsadbw', [0x0F, 0x3A, 0x42], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('packusdw', [0x0F, 0x38, 0x2B], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pblendvb', [0x0F, 0x38, 0x10], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pblendw', [0x0F, 0x3A, 0x0E], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('pcmpeqq', [0x0F, 0x38, 0x29], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pextrb', [0x0F, 0x3A, 0x14], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66; o.args[o.args.index(:modrmxmm)] = :modrm; o.props[:argsz] = 8 }
addop('pextrw', [0x0F, 0x3A, 0x15], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66; o.args[o.args.index(:modrmxmm)] = :modrm; o.props[:argsz] = 16 }
addop('pextrd', [0x0F, 0x3A, 0x16], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66; o.args[o.args.index(:modrmxmm)] = :modrm; o.props[:argsz] = 32 }
addop('pinsrb', [0x0F, 0x3A, 0x20], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66; o.args[o.args.index(:modrmxmm)] = :modrm; o.props[:argsz] = 8 }
addop('pinsrw', [0x0F, 0x3A, 0x21], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66; o.args[o.args.index(:modrmxmm)] = :modrm; o.props[:argsz] = 16 }
addop('pinsrd', [0x0F, 0x3A, 0x22], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66; o.args[o.args.index(:modrmxmm)] = :modrm; o.props[:argsz] = 32 }
addop('phminposuw', [0x0F, 0x38, 0x41], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pminsb', [0x0F, 0x38, 0x38], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pminsd', [0x0F, 0x38, 0x39], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pminuw', [0x0F, 0x38, 0x3A], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pminud', [0x0F, 0x38, 0x3B], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmaxsb', [0x0F, 0x38, 0x3C], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmaxsd', [0x0F, 0x38, 0x3D], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmaxuw', [0x0F, 0x38, 0x3E], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmaxud', [0x0F, 0x38, 0x3F], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovsxbw', [0x0F, 0x38, 0x20], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovsxbd', [0x0F, 0x38, 0x21], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovsxbq', [0x0F, 0x38, 0x22], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovsxwd', [0x0F, 0x38, 0x23], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovsxwq', [0x0F, 0x38, 0x24], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovsxdq', [0x0F, 0x38, 0x25], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovzxbw', [0x0F, 0x38, 0x30], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovzxbd', [0x0F, 0x38, 0x31], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovzxbq', [0x0F, 0x38, 0x32], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovzxwd', [0x0F, 0x38, 0x33], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovzxwq', [0x0F, 0x38, 0x34], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmovzxdq', [0x0F, 0x38, 0x35], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmuldq', [0x0F, 0x38, 0x28], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('pmulld', [0x0F, 0x38, 0x40], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('ptest', [0x0F, 0x38, 0x17], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('roundps', [0x0F, 0x3A, 0x08], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('roundpd', [0x0F, 0x3A, 0x09], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('roundss', [0x0F, 0x3A, 0x0A], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
addop('roundsd', [0x0F, 0x3A, 0x0B], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66 }
end
def init_sse42_only
init_cpu_constants
addop('crc32', [0x0F, 0x38, 0xF0], :mrmw) { |o| o.props[:needpfx] = 0xF2 }
addop('pcmpestrm', [0x0F, 0x3A, 0x60], :mrmxmm, :i8) { |o| o.props[:needpfx] = 0x66 }
addop('pcmpestri', [0x0F, 0x3A, 0x61], :mrmxmm, :i8) { |o| o.props[:needpfx] = 0x66 }
addop('pcmpistrm', [0x0F, 0x3A, 0x62], :mrmxmm, :i8) { |o| o.props[:needpfx] = 0x66 }
addop('pcmpistri', [0x0F, 0x3A, 0x63], :mrmxmm, :i8) { |o| o.props[:needpfx] = 0x66 }
addop('pcmpgtq', [0x0F, 0x38, 0x37], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('popcnt', [0x0F, 0xB8], :mrm) { |o| o.props[:needpfx] = 0xF3 }
end
def init_avx_only
init_cpu_constants
add128 = {}
add256 = {}
%w[movss movsd movlhps movhpd movhlps
cvtsi2ss cvtsi2sd sqrtss sqrtsd rsqrtss rcpss
addss addsd mulss mulsd cvtss2sd cvtsd2ss subss subsd
minss minsd divss divsd maxss maxsd
punpcklb punpcklw punpckld packsswb pcmpgtb pcmpgtw pcmpgtd packuswb
punpckhb punpckhw punpckhd packssdw punpcklq punpckhq
pcmpeqb pcmpeqw pcmpeqd ldmxcsr stmxcsr
cmpss cmpsd paddq pmullw psubusb psubusw pminub
pand paddusb paddusw pmaxub pandn pavgb pavgw
pmulhuw pmulhw psubsb psubsw pminsw por paddsb paddsw pmaxsw pxor
pmuludq pmaddwd psadbw
psubb psubw psubd psubq paddb paddw paddd
phaddw phaddsw phaddd phsubw phsubsw phsubd
pmaddubsw palignr pshufb pmulhrsw psignb psignw psignd
dppd insertps mpsadbw packusdw pblendw pcmpeqq
pinsrb pinsrw pinsrd pinsrq
pmaxsb pmaxsd pmaxud pmaxuw pminsb pminsd pminud pminuw
pmuldq pmulld roundsd roundss pcmpgtq
aesdec aesdeclast aesenc aesenclast
pclmulqdq punpcklbw punpcklwd punpckldq punpckhbw punpckhwd
punpckhdq punpcklqdq punpckhqdq].each { |n| add128[n] = true }
%w[movups movupd movddup movsldup
unpcklps unpcklpd unpckhps unpckhpd
movaps movshdup movapd movntps movntpd movmskps movmskpd
sqrtps sqrtpd rsqrtps rcpps andps andpd andnps andnpd
orps orpd xorps xorpd addps addpd mulps mulpd
cvtps2pd cvtpd2ps cvtdq2ps cvtps2dq cvttps2dq
subps subpd minps minpd divps divpd maxps maxpd
movdqa movdqu haddpd haddps hsubpd hsubps
cmpps cmppd shufps shufpd addsubpd addsubps
cvtpd2dq cvttpd2dq cvtdq2pd movntdq lddqu
blendps blendpd blendvps blendvpd dpps ptest
roundpd roundps].each { |n| add128[n] = add256[n] = true }
varg = Hash.new(1)
%w[pabsb pabsw pabsd pmovmskb pshufd pshufhw pshuflw movntdqa
pmovsxbw pmovsxbd pmovsxbq pmovsxwd pmovsxwq pmovsxdq
pmovzxbw pmovzxbd pmovzxbq pmovzxwd pmovzxwq pmovzxdq
aesimc aeskeygenassist lddqu maskmovdqu movapd movaps
pcmpestri pcmpestrm pcmpistri pcmpistrm phminposuw
cvtpd2dq cvttpd2dq cvtdq2pd cvtps2pd cvtpd2ps cvtdq2ps cvtps2dq
cvttps2dq movd movq movddup movdqa movdqu movmskps movmskpd
movntdq movntps movntpd movshdup movsldup movups movupd
pextrb pextrw pextrd pextrq ptest rcpps roundps roundpd
extractps sqrtps sqrtpd comiss comisd ucomiss ucomisd
cvttss2si cvttsd2si cvtss2si cvtsd2si
].each { |n| add128[n] = true ; varg[n] = nil }
cvtarg128 = { :regmmx => :regxmm, :modrmmmx => :modrmxmm }
cvtarg256 = { :regmmx => :regymm, :modrmmmx => :modrmymm,
:regxmm => :regymm, :modrmxmm => :modrmymm }
# autopromote old sseX opcodes
@opcode_list.each { |o|
next if o.bin[0] != 0x0F or not add128[o.name] # rep cmpsd / movsd
mm = (o.bin[1] == 0x38 ? 0x0F38 : o.bin[1] == 0x3A ? 0x0F3A : 0x0F)
pp = o.props[:needpfx]
pp = 0x66 if o.props[:xmmx]
fpxlen = (mm == 0x0F ? 1 : 2)
addop_vex('v' + o.name, [varg[o.name], 128, pp, mm], o.bin[fpxlen], nil, *o.args.map { |oa| cvtarg128[oa] || oa }) { |oo|
oo.bin += [o.bin[fpxlen+1]] if o.bin[fpxlen+1]
dbinlen = o.bin.length - oo.bin.length
o.fields.each { |k, v| oo.fields[cvtarg128[k] || k] = [v[0]-dbinlen, v[1]] }
o.props.each { |k, v| oo.props[k] = v if k != :xmmx and k != :needpfx }
}
next if not add256[o.name]
addop_vex('v' + o.name, [varg[o.name], 256, pp, mm], o.bin[fpxlen], nil, *o.args.map { |oa| cvtarg256[oa] || oa }) { |oo|
oo.bin += [o.bin[fpxlen+1]] if o.bin[fpxlen+1]
dbinlen = o.bin.length - oo.bin.length
o.fields.each { |k, v| oo.fields[cvtarg256[k] || k] = [v[0]-dbinlen, v[1]] }
o.props.each { |k, v| oo.props[k] = v if k != :xmmx and k != :needpfx }
}
}
# sse promotion, special cases
addop_vex 'vpblendvb', [1, 128, 0x66, 0x0F3A, 0], 0x4C, :mrmxmm, :i4xmm
addop_vex 'vpsllw', [1, 128, 0x66, 0x0F], 0xF1, :mrmxmm
addop_vex('vpsllw', [0, 128, 0x66, 0x0F], 0x71, 6, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmxmm }
addop_vex 'vpslld', [1, 128, 0x66, 0x0F], 0xF2, :mrmxmm
addop_vex('vpslld', [0, 128, 0x66, 0x0F], 0x72, 6, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmxmm }
addop_vex 'vpsllq', [1, 128, 0x66, 0x0F], 0xF3, :mrmxmm
addop_vex('vpsllq', [0, 128, 0x66, 0x0F], 0x73, 6, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmxmm }
addop_vex('vpslldq',[0, 128, 0x66, 0x0F], 0x73, 7, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmxmm }
addop_vex 'vpsraw', [1, 128, 0x66, 0x0F], 0xE1, :mrmxmm
addop_vex('vpsraw', [0, 128, 0x66, 0x0F], 0x71, 4, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmxmm }
addop_vex 'vpsrad', [1, 128, 0x66, 0x0F], 0xE2, :mrmxmm
addop_vex('vpsrad', [0, 128, 0x66, 0x0F], 0x72, 4, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmxmm }
addop_vex 'vpsrlw', [1, 128, 0x66, 0x0F], 0xD1, :mrmxmm
addop_vex('vpsrlw', [0, 128, 0x66, 0x0F], 0x71, 2, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmxmm }
addop_vex 'vpsrld', [1, 128, 0x66, 0x0F], 0xD2, :mrmxmm
addop_vex('vpsrld', [0, 128, 0x66, 0x0F], 0x72, 2, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmxmm }
addop_vex 'vpsrlq', [1, 128, 0x66, 0x0F], 0xD3, :mrmxmm
addop_vex('vpsrlq', [0, 128, 0x66, 0x0F], 0x73, 2, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmxmm }
addop_vex('vpsrldq',[0, 128, 0x66, 0x0F], 0x73, 3, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmxmm }
# dst==mem => no vreg
addop_vex 'vmovhps', [1, 128, nil, 0x0F], 0x16, :mrmxmm, :modrmA
addop_vex('vmovhps', [nil, 128, nil, 0x0F], 0x17, :mrmxmm, :modrmA) { |o| o.args.reverse! }
addop_vex 'vmovlpd', [1, 128, 0x66, 0x0F], 0x12, :mrmxmm, :modrmA
addop_vex('vmovlpd', [nil, 128, 0x66, 0x0F], 0x13, :mrmxmm, :modrmA) { |o| o.args.reverse! }
addop_vex 'vmovlps', [1, 128, nil, 0x0F], 0x12, :mrmxmm, :modrmA
addop_vex('vmovlps', [nil, 128, nil, 0x0F], 0x13, :mrmxmm, :modrmA) { |o| o.args.reverse! }
addop_vex 'vbroadcastss', [nil, 128, 0x66, 0x0F38, 0], 0x18, :mrmxmm, :modrmA
addop_vex 'vbroadcastss', [nil, 256, 0x66, 0x0F38, 0], 0x18, :mrmymm, :modrmA
addop_vex 'vbroadcastsd', [nil, 256, 0x66, 0x0F38, 0], 0x19, :mrmymm, :modrmA
addop_vex 'vbroadcastf128', [nil, 256, 0x66, 0x0F38, 0], 0x1A, :mrmymm, :modrmA
# general-purpose register operations
addop_vex 'andn', [1, :vexvreg, 128, nil, 0x0F38], 0xF2, :mrm
addop_vex 'bextr', [2, :vexvreg, 128, nil, 0x0F38], 0xF7, :mrm
addop_vex 'blsi', [0, :vexvreg, 128, nil, 0x0F38], 0xF3, 3
addop_vex 'blsmsk', [0, :vexvreg, 128, nil, 0x0F38], 0xF3, 2
addop_vex 'blsr', [0, :vexvreg, 128, nil, 0x0F38], 0xF3, 1
addop_vex 'bzhi', [2, :vexvreg, 128, nil, 0x0F38], 0xF5, :mrm
addop('lzcnt', [0x0F, 0xBD], :mrm) { |o| o.props[:needpfx] = 0xF3 }
addop_vex 'mulx', [1, :vexvreg, 128, 0xF2, 0x0F38], 0xF6, :mrm
addop_vex 'pdep', [1, :vexvreg, 128, 0xF2, 0x0F38], 0xF5, :mrm
addop_vex 'pext', [1, :vexvreg, 128, 0xF3, 0x0F38], 0xF5, :mrm
addop_vex 'rorx', [nil, 128, 0xF2, 0x0F3A], 0xF0, :mrm, :u8
addop_vex 'sarx', [2, :vexvreg, 128, 0xF3, 0x0F38], 0xF7, :mrm
addop_vex 'shrx', [2, :vexvreg, 128, 0xF2, 0x0F38], 0xF7, :mrm
addop_vex 'shlx', [2, :vexvreg, 128, 0x66, 0x0F38], 0xF7, :mrm
addop('tzcnt', [0x0F, 0xBC], :mrm) { |o| o.props[:needpfx] = 0xF3 }
addop('invpcid', [0x0F, 0x38, 0x82], :mrm) { |o| o.props[:needpfx] = 0x66 }
addop 'rdrand', [0x0F, 0xC7], 6, :modrmR
addop 'rdseed', [0x0F, 0xC7], 7, :modrmR
addop('adcx', [0x0F, 0x38, 0xF6], :mrm) { |o| o.props[:needpfx] = 0x66 }
addop('adox', [0x0F, 0x38, 0xF6], :mrm) { |o| o.props[:needpfx] = 0xF3 }
# fp16
addop_vex 'vcvtph2ps', [nil, 128, 0x66, 0x0F38, 0], 0x13, :mrmxmm
addop_vex 'vcvtph2ps', [nil, 256, 0x66, 0x0F38, 0], 0x13, :mrmymm
addop_vex('vcvtps2ph', [nil, 128, 0x66, 0x0F3A, 0], 0x1D, :mrmxmm, :u8) { |o| o.args.reverse! }
addop_vex('vcvtps2ph', [nil, 256, 0x66, 0x0F3A, 0], 0x1D, :mrmymm, :u8) { |o| o.args.reverse! }
# TSX
addop 'xabort', [0xC6, 0xF8], nil, :i8 # may :stopexec
addop 'xbegin', [0xC7, 0xF8], nil, :i # may :setip: xabort returns to $_(xbegin) + off
addop 'xend', [0x0F, 0x01, 0xD5]
addop 'xtest', [0x0F, 0x01, 0xD6]
# SMAP
addop 'clac', [0x0F, 0x01, 0xCA]
addop 'stac', [0x0F, 0x01, 0xCB]
end
def init_avx2_only
init_cpu_constants
add256 = {}
%w[packsswb pcmpgtb pcmpgtw pcmpgtd packuswb packssdw
pcmpeqb pcmpeqw pcmpeqd paddq pmullw psubusb psubusw
pminub pand paddusb paddusw pmaxub pandn pavgb pavgw
pmulhuw pmulhw psubsb psubsw pminsw por paddsb paddsw
pmaxsw pxor pmuludq pmaddwd psadbw
psubb psubw psubd psubq paddb paddw paddd
phaddw phaddsw phaddd phsubw phsubsw phsubd
pmaddubsw palignr pshufb pmulhrsw psignb psignw psignd
mpsadbw packusdw pblendw pcmpeqq
pmaxsb pmaxsd pmaxud pmaxuw pminsb pminsd pminud pminuw
pmuldq pmulld pcmpgtq punpcklbw punpcklwd punpckldq
punpckhbw punpckhwd punpckhdq punpcklqdq punpckhqdq
].each { |n| add256[n] = true }
varg = Hash.new(1)
%w[pabsb pabsw pabsd pmovmskb pshufd pshufhw pshuflw movntdqa
pmovsxbw pmovsxbd pmovsxbq pmovsxwd pmovsxwq pmovsxdq
pmovzxbw pmovzxbd pmovzxbq pmovzxwd pmovzxwq pmovzxdq
maskmovdqu].each { |n| add256[n] = true ; varg[n] = nil }
cvtarg256 = { :regmmx => :regymm, :modrmmmx => :modrmymm,
:regxmm => :regymm, :modrmxmm => :modrmymm }
# autopromote old sseX opcodes
@opcode_list.each { |o|
next if o.bin[0] != 0x0F or not add256[o.name]
mm = (o.bin[1] == 0x38 ? 0x0F38 : o.bin[1] == 0x3A ? 0x0F3A : 0x0F)
pp = o.props[:needpfx]
pp = 0x66 if o.props[:xmmx]
fpxlen = (mm == 0x0F ? 1 : 2)
addop_vex('v' + o.name, [varg[o.name], 256, pp, mm], o.bin[fpxlen], nil, *o.args.map { |oa| cvtarg256[oa] || oa }) { |oo|
oo.bin += [o.bin[fpxlen+1]] if o.bin[fpxlen+1]
dbinlen = o.bin.length - oo.bin.length
o.fields.each { |k, v| oo.fields[cvtarg256[k] || k] = [v[0]-dbinlen, v[1]] }
o.props.each { |k, v| oo.props[k] = v if k != :xmmx and k != :needpfx }
}
}
# promote special cases
addop_vex 'vpblendvb', [1, 256, 0x66, 0x0F3A, 0], 0x4C, :mrmymm, :i4ymm
addop_vex 'vpsllw', [1, 256, 0x66, 0x0F], 0xF1, :mrmymm
addop_vex('vpsllw', [0, 256, 0x66, 0x0F], 0x71, 6, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmymm }
addop_vex 'vpslld', [1, 256, 0x66, 0x0F], 0xF2, :mrmymm
addop_vex('vpslld', [0, 256, 0x66, 0x0F], 0x72, 6, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmymm }
addop_vex 'vpsllq', [1, 256, 0x66, 0x0F], 0xF3, :mrmymm
addop_vex('vpsllq', [0, 256, 0x66, 0x0F], 0x73, 6, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmymm }
addop_vex('vpslldq',[0, 256, 0x66, 0x0F], 0x73, 7, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmymm }
addop_vex 'vpsraw', [1, 256, 0x66, 0x0F], 0xE1, :mrmymm
addop_vex('vpsraw', [0, 256, 0x66, 0x0F], 0x71, 4, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmymm }
addop_vex 'vpsrad', [1, 256, 0x66, 0x0F], 0xE2, :mrmymm
addop_vex('vpsrad', [0, 256, 0x66, 0x0F], 0x72, 4, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmymm }
addop_vex 'vpsrlw', [1, 256, 0x66, 0x0F], 0xD1, :mrmymm
addop_vex('vpsrlw', [0, 256, 0x66, 0x0F], 0x71, 2, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmymm }
addop_vex 'vpsrld', [1, 256, 0x66, 0x0F], 0xD2, :mrmymm
addop_vex('vpsrld', [0, 256, 0x66, 0x0F], 0x72, 2, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmymm }
addop_vex 'vpsrlq', [1, 256, 0x66, 0x0F], 0xD3, :mrmymm
addop_vex('vpsrlq', [0, 256, 0x66, 0x0F], 0x73, 2, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmymm }
addop_vex('vpsrldq',[0, 256, 0x66, 0x0F], 0x73, 3, :u8, :modrmR) { |o| o.args[o.args.index(:modrm)] = :modrmymm }
addop_vex 'vbroadcastss', [nil, 128, 0x66, 0x0F38, 0], 0x18, :mrmxmm, :modrmR
addop_vex 'vbroadcastss', [nil, 256, 0x66, 0x0F38, 0], 0x18, :mrmymm, :modrmR
addop_vex 'vbroadcastsd', [nil, 256, 0x66, 0x0F38, 0], 0x19, :mrmymm, :modrmR
addop_vex 'vbroadcasti128', [nil, 256, 0x66, 0x0F38, 0], 0x5A, :mrmymm, :modrmA
addop_vex 'vpblendd', [1, 128, 0x66, 0x0F3A, 0], 0x02, :mrmxmm, :u8
addop_vex 'vpblendd', [1, 256, 0x66, 0x0F3A, 0], 0x02, :mrmymm, :u8
addop_vex 'vpbroadcastb', [nil, 128, 0x66, 0x0F38, 0], 0x78, :mrmxmm
addop_vex 'vpbroadcastb', [nil, 256, 0x66, 0x0F38, 0], 0x78, :mrmymm
addop_vex 'vpbroadcastw', [nil, 128, 0x66, 0x0F38, 0], 0x79, :mrmxmm
addop_vex 'vpbroadcastw', [nil, 256, 0x66, 0x0F38, 0], 0x79, :mrmymm
addop_vex 'vpbroadcastd', [nil, 128, 0x66, 0x0F38, 0], 0x58, :mrmxmm
addop_vex 'vpbroadcastd', [nil, 256, 0x66, 0x0F38, 0], 0x58, :mrmymm
addop_vex 'vpbroadcastq', [nil, 128, 0x66, 0x0F38, 0], 0x59, :mrmxmm
addop_vex 'vpbroadcastq', [nil, 256, 0x66, 0x0F38, 0], 0x59, :mrmymm
addop_vex 'vpermd', [1, 256, 0x66, 0x0F38, 0], 0x36, :mrmymm
addop_vex 'vpermpd', [nil, 256, 0x66, 0x0F3A, 1], 0x01, :mrmymm, :u8
addop_vex 'vpermps', [1, 256, 0x66, 0x0F38, 0], 0x16, :mrmymm, :u8
addop_vex 'vpermq', [nil, 256, 0x66, 0x0F3A, 1], 0x00, :mrmymm, :u8
addop_vex 'vperm2i128', [1, 256, 0x66, 0x0F3A, 0], 0x46, :mrmymm, :u8
addop_vex 'vextracti128', [nil, 256, 0x66, 0x0F3A, 0], 0x39, :mrmymm, :u8
addop_vex 'vinserti128', [1, 256, 0x66, 0x0F3A, 0], 0x38, :mrmymm, :u8
addop_vex 'vpmaskmovd', [1, 128, 0x66, 0x0F38, 0], 0x8C, :mrmxmm, :modrmA
addop_vex 'vpmaskmovd', [1, 256, 0x66, 0x0F38, 0], 0x8C, :mrmymm, :modrmA
addop_vex 'vpmaskmovq', [1, 128, 0x66, 0x0F38, 1], 0x8C, :mrmxmm, :modrmA
addop_vex 'vpmaskmovq', [1, 256, 0x66, 0x0F38, 1], 0x8C, :mrmymm, :modrmA
addop_vex('vpmaskmovd', [1, 128, 0x66, 0x0F38, 0], 0x8E, :mrmxmm, :modrmA) { |o| o.args.reverse! }
addop_vex('vpmaskmovd', [1, 256, 0x66, 0x0F38, 0], 0x8E, :mrmymm, :modrmA) { |o| o.args.reverse! }
addop_vex('vpmaskmovq', [1, 128, 0x66, 0x0F38, 1], 0x8E, :mrmxmm, :modrmA) { |o| o.args.reverse! }
addop_vex('vpmaskmovq', [1, 256, 0x66, 0x0F38, 1], 0x8E, :mrmymm, :modrmA) { |o| o.args.reverse! }
addop_vex 'vpsllvd', [1, 128, 0x66, 0x0F38, 0], 0x47, :mrmxmm
addop_vex 'vpsllvq', [1, 128, 0x66, 0x0F38, 1], 0x47, :mrmxmm
addop_vex 'vpsllvd', [1, 256, 0x66, 0x0F38, 0], 0x47, :mrmymm
addop_vex 'vpsllvq', [1, 256, 0x66, 0x0F38, 1], 0x47, :mrmymm
addop_vex 'vpsravd', [1, 128, 0x66, 0x0F38, 0], 0x46, :mrmxmm
addop_vex 'vpsravd', [1, 256, 0x66, 0x0F38, 0], 0x46, :mrmymm
addop_vex 'vpsrlvd', [1, 128, 0x66, 0x0F38, 0], 0x45, :mrmxmm
addop_vex 'vpsrlvq', [1, 128, 0x66, 0x0F38, 1], 0x45, :mrmxmm
addop_vex 'vpsrlvd', [1, 256, 0x66, 0x0F38, 0], 0x45, :mrmymm
addop_vex 'vpsrlvq', [1, 256, 0x66, 0x0F38, 1], 0x45, :mrmymm
addop_vex('vpgatherdd', [2, 128, 0x66, 0x0F38, 0], 0x90, :mrmxmm) { |o| o.props[:argsz] = 32 ; o.props[:mrmvex] = 128 }
addop_vex('vpgatherdd', [2, 256, 0x66, 0x0F38, 0], 0x90, :mrmymm) { |o| o.props[:argsz] = 32 ; o.props[:mrmvex] = 256 }
addop_vex('vpgatherdq', [2, 128, 0x66, 0x0F38, 1], 0x90, :mrmxmm) { |o| o.props[:argsz] = 64 ; o.props[:mrmvex] = 128 }
addop_vex('vpgatherdq', [2, 256, 0x66, 0x0F38, 1], 0x90, :mrmymm) { |o| o.props[:argsz] = 64 ; o.props[:mrmvex] = 256 }
addop_vex('vpgatherqd', [2, 128, 0x66, 0x0F38, 0], 0x91, :mrmxmm) { |o| o.props[:argsz] = 32 ; o.props[:mrmvex] = 128 }
addop_vex('vpgatherqd', [2, 256, 0x66, 0x0F38, 0], 0x91, :mrmymm) { |o| o.props[:argsz] = 32 ; o.props[:mrmvex] = 256 }
addop_vex('vpgatherqq', [2, 128, 0x66, 0x0F38, 1], 0x91, :mrmxmm) { |o| o.props[:argsz] = 64 ; o.props[:mrmvex] = 128 }
addop_vex('vpgatherqq', [2, 256, 0x66, 0x0F38, 1], 0x91, :mrmymm) { |o| o.props[:argsz] = 64 ; o.props[:mrmvex] = 256 }
addop_vex('vgatherdps', [2, 128, 0x66, 0x0F38, 0], 0x92, :mrmxmm) { |o| o.props[:argsz] = 32 ; o.props[:mrmvex] = 128 }
addop_vex('vgatherdps', [2, 256, 0x66, 0x0F38, 0], 0x92, :mrmymm) { |o| o.props[:argsz] = 32 ; o.props[:mrmvex] = 256 }
addop_vex('vgatherdpd', [2, 128, 0x66, 0x0F38, 1], 0x92, :mrmxmm) { |o| o.props[:argsz] = 64 ; o.props[:mrmvex] = 128 }
addop_vex('vgatherdpd', [2, 256, 0x66, 0x0F38, 1], 0x92, :mrmymm) { |o| o.props[:argsz] = 64 ; o.props[:mrmvex] = 256 }
addop_vex('vgatherqps', [2, 128, 0x66, 0x0F38, 0], 0x93, :mrmxmm) { |o| o.props[:argsz] = 32 ; o.props[:mrmvex] = 128 }
addop_vex('vgatherqps', [2, 256, 0x66, 0x0F38, 0], 0x93, :mrmymm) { |o| o.props[:argsz] = 32 ; o.props[:mrmvex] = 256 }
addop_vex('vgatherqpd', [2, 128, 0x66, 0x0F38, 1], 0x93, :mrmxmm) { |o| o.props[:argsz] = 64 ; o.props[:mrmvex] = 128 }
addop_vex('vgatherqpd', [2, 256, 0x66, 0x0F38, 1], 0x93, :mrmymm) { |o| o.props[:argsz] = 64 ; o.props[:mrmvex] = 256 }
end
def init_fma_only
init_cpu_constants
[['vfmaddsub', 'p', 0x86],
['vfmsubadd', 'p', 0x87],
['vfmadd', 'p', 0x88],
['vfmadd', 's', 0x89],
['vfmsub', 'p', 0x8A],
['vfmsub', 's', 0x8B],
['vfnmadd', 'p', 0x8C],
['vfnmadd', 's', 0x8D],
['vfnmsub', 'p', 0x8E],
['vfnmsub', 's', 0x8F]].each { |n1, n2, bin|
addop_vex n1 + '132' + n2 + 's', [1, 128, 0x66, 0x0F38, 0], bin | 0x10, :mrmxmm
addop_vex n1 + '132' + n2 + 's', [1, 256, 0x66, 0x0F38, 0], bin | 0x10, :mrmymm
addop_vex n1 + '132' + n2 + 'd', [1, 128, 0x66, 0x0F38, 1], bin | 0x10, :mrmxmm
addop_vex n1 + '132' + n2 + 'd', [1, 256, 0x66, 0x0F38, 1], bin | 0x10, :mrmymm
addop_vex n1 + '213' + n2 + 's', [1, 128, 0x66, 0x0F38, 0], bin | 0x20, :mrmxmm
addop_vex n1 + '213' + n2 + 's', [1, 256, 0x66, 0x0F38, 0], bin | 0x20, :mrmymm
addop_vex n1 + '213' + n2 + 'd', [1, 128, 0x66, 0x0F38, 1], bin | 0x20, :mrmxmm
addop_vex n1 + '213' + n2 + 'd', [1, 256, 0x66, 0x0F38, 1], bin | 0x20, :mrmymm
addop_vex n1 + '231' + n2 + 's', [1, 128, 0x66, 0x0F38, 0], bin | 0x30, :mrmxmm
addop_vex n1 + '231' + n2 + 's', [1, 256, 0x66, 0x0F38, 0], bin | 0x30, :mrmymm
addop_vex n1 + '231' + n2 + 'd', [1, 128, 0x66, 0x0F38, 1], bin | 0x30, :mrmxmm
addop_vex n1 + '231' + n2 + 'd', [1, 256, 0x66, 0x0F38, 1], bin | 0x30, :mrmymm
# pseudo-opcodes aliases (swap arg0/arg1)
addop_vex(n1 + '312' + n2 + 's', [1, 128, 0x66, 0x0F38, 0], bin | 0x10, :mrmxmm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
addop_vex(n1 + '312' + n2 + 's', [1, 256, 0x66, 0x0F38, 0], bin | 0x10, :mrmymm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
addop_vex(n1 + '312' + n2 + 'd', [1, 128, 0x66, 0x0F38, 1], bin | 0x10, :mrmxmm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
addop_vex(n1 + '312' + n2 + 'd', [1, 256, 0x66, 0x0F38, 1], bin | 0x10, :mrmymm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
addop_vex(n1 + '123' + n2 + 's', [1, 128, 0x66, 0x0F38, 0], bin | 0x20, :mrmxmm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
addop_vex(n1 + '123' + n2 + 's', [1, 256, 0x66, 0x0F38, 0], bin | 0x20, :mrmymm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
addop_vex(n1 + '123' + n2 + 'd', [1, 128, 0x66, 0x0F38, 1], bin | 0x20, :mrmxmm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
addop_vex(n1 + '123' + n2 + 'd', [1, 256, 0x66, 0x0F38, 1], bin | 0x20, :mrmymm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
addop_vex(n1 + '321' + n2 + 's', [1, 128, 0x66, 0x0F38, 0], bin | 0x30, :mrmxmm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
addop_vex(n1 + '321' + n2 + 's', [1, 256, 0x66, 0x0F38, 0], bin | 0x30, :mrmymm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
addop_vex(n1 + '321' + n2 + 'd', [1, 128, 0x66, 0x0F38, 1], bin | 0x30, :mrmxmm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
addop_vex(n1 + '321' + n2 + 'd', [1, 256, 0x66, 0x0F38, 1], bin | 0x30, :mrmymm) { |o| o.args[0, 2] = o.args[0, 2].reverse }
}
end
#
# CPU family dependencies
#
def init_386_common
init_386_common_only
end
def init_386
init_386_common
init_386_only
end
def init_387
init_387_only
end
def init_486
init_386
init_387
init_486_only
end
def init_pentium
init_486
init_pentium_only
end
def init_3dnow
init_pentium
init_3dnow_only
end
def init_p6
init_pentium
init_p6_only
end
def init_sse
init_p6
init_sse_only
end
def init_sse2
init_sse
init_sse2_only
end
def init_sse3
init_sse2
init_sse3_only
end
def init_ssse3
init_sse3
init_ssse3_only
end
def init_sse41
init_ssse3
init_sse41_only
end
def init_sse42
init_sse41
init_sse42_only
end
def init_avx
init_sse42
init_avx_only
end
def init_avx2
init_avx
init_fma_only
init_avx2_only
end
def init_all
init_avx2
init_3dnow_only
init_vmx_only
init_aesni_only
end
alias init_latest init_all
#
# addop_* macros
#
def addop_macro1(name, num, *props)
addop name, [(num << 3) | 4], nil, {:w => [0, 0]}, :reg_eax, :i, *props
addop(name, [num << 3], :mrmw, {:d => [0, 1]}) { |o| o.args.reverse! }
addop name, [0x80], num, {:w => [0, 0], :s => [0, 1]}, :i, *props
end
def addop_macro2(name, num)
addop name, [0x0F, 0xBA], (4 | num), :u8
addop(name, [0x0F, 0xA3 | (num << 3)], :mrm) { |op| op.args.reverse! }
end
def addop_macro3(name, num)
addop name, [0xD0], num, {:w => [0, 0]}, :imm_val1
addop name, [0xD2], num, {:w => [0, 0]}, :reg_cl
addop name, [0xC0], num, {:w => [0, 0]}, :u8
end
def addop_macrotttn(name, bin, hint, *props, &blk)
[%w{o}, %w{no}, %w{b nae c}, %w{nb ae nc},
%w{z e}, %w{nz ne}, %w{be na}, %w{nbe a},
%w{s}, %w{ns}, %w{p pe}, %w{np po},
%w{l nge}, %w{nl ge}, %w{le ng}, %w{nle g}].each_with_index { |e, i|
b = bin.dup
if b[0] == 0x0F
b[1] |= i
else
b[0] |= i
end
e.each { |k| addop(name + k, b.dup, hint, *props, &blk) }
}
end
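The condition-code expansion performed by addop_macrotttn can be sketched in isolation. The helper below is hypothetical (expand_tttn is not a Metasm method); it only reproduces the suffix table and the OR-ing of the condition index into the opcode byte, and returns a plain Hash instead of registering opcodes.

```ruby
# Hypothetical stand-alone sketch of the tttn expansion; not part of Metasm.
def expand_tttn(name, bin)
  suffixes = [%w{o}, %w{no}, %w{b nae c}, %w{nb ae nc},
              %w{z e}, %w{nz ne}, %w{be na}, %w{nbe a},
              %w{s}, %w{ns}, %w{p pe}, %w{np po},
              %w{l nge}, %w{nl ge}, %w{le ng}, %w{nle g}]
  out = {}
  suffixes.each_with_index { |aliases, i|
    b = bin.dup
    # two-byte opcodes (0x0F escape) carry the condition in the second byte
    if b[0] == 0x0F
      b[1] |= i
    else
      b[0] |= i
    end
    # every alias of one condition shares the same encoding
    aliases.each { |k| out[name + k] = b.dup }
  }
  out
end

jcc = expand_tttn('j', [0x70])
p jcc['jz']   # => [116], i.e. 0x74
p jcc['je']   # => [116], same encoding as jz
```

Both aliases map to the same bytes, which is why a disassembler has to pick one spelling for a given encoding.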
def addop_macrostr(name, bin, type)
# addop(name, bin.dup, {:w => [0, 0]}) { |o| o.props[type] = true } # TODO allow segment override
addop(name+'b', bin) { |o| o.props[:opsz] = 16 ; o.props[type] = true }
addop(name+'b', bin) { |o| o.props[:opsz] = 32 ; o.props[type] = true }
bin = bin.dup
bin[0] |= 1
addop(name+'w', bin) { |o| o.props[:opsz] = 16 ; o.props[type] = true }
addop(name+'d', bin) { |o| o.props[:opsz] = 32 ; o.props[type] = true }
end
def addop_macrofpu1(name, n)
addop(name, [0xD8, n<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 32 }
addop(name, [0xDC, n<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 64 }
addop(name, [0xD8, 0xC0|(n<<3)], :regfp, {:d => [0, 2]}) { |o| o.args.reverse! }
end
def addop_macrofpu2(name, n, n2=0)
addop(name, [0xDE|n2, n<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 16 }
addop(name, [0xDA|n2, n<<3], :modrmA, :regfp0) { |o| o.props[:argsz] = 32 }
end
def addop_macrofpu3(name, n)
addop_macrofpu2 name, n, 1
addop(name, [0xDF, 0x28|(n<<3)], :modrmA, :regfp0) { |o| o.props[:argsz] = 64 }
end
def addop_macrogg(ggrng, name, bin, *args, &blk)
ggoff = 1
ggoff = 2 if bin[1] == 0x38 or bin[1] == 0x3A
ggrng.each { |gg|
bindup = bin.dup
bindup[ggoff] |= gg
sfx = %w(b w d q)[gg]
addop name+sfx, bindup, *args, &blk
}
end
def addop_macrossps(name, bin, hint, *a)
addop name, bin.dup, hint, *a
addop(name.sub(/ps$/, 'ss'), bin.dup, hint, *a) { |o| o.props[:needpfx] = 0xF3 }
end
def addop_macrosdpd(name, bin, hint, *a)
addop(name, bin.dup, hint, *a) { |o| o.props[:needpfx] = 0x66 }
addop(name.sub(/pd$/, 'sd'), bin.dup, hint, *a) { |o| o.props[:needpfx] = 0xF2 }
end
# special rets (iret/retf) that still default to 32-bit mode in x64
def addop_macroret(name, bin, *args)
addop(name + '.i32', bin.dup, nil, :stopexec, :setip, *args) { |o| o.props[:opsz] = 32 }
addop(name + '.i16', bin.dup, nil, :stopexec, :setip, *args) { |o| o.props[:opsz] = 16 } if name != 'sysret'
addop(name, bin.dup, nil, :stopexec, :setip, *args) { |o| o.props[:opsz] = @size }
end
# add an AVX instruction needing a VEX prefix (c4h/c5h)
# the prefix is hardcoded
def addop_vex(name, vexspec, bin, *args)
argnr = vexspec.shift
argt = vexspec.shift if argnr and vexspec.first.kind_of?(::Symbol)
l = vexspec.shift
pfx = vexspec.shift
of = vexspec.shift
w = vexspec.shift
argt ||= (l == 128 ? :vexvxmm : :vexvymm)
lpp = ((l >> 8) << 2) | [nil, 0x66, 0xF3, 0xF2].index(pfx)
mmmmm = [nil, 0x0F, 0x0F38, 0x0F3A].index(of)
c4bin = [0xC4, mmmmm, lpp, bin]
c4bin[1] |= 1 << 7 if @size != 64
c4bin[1] |= 1 << 6 if @size != 64
c4bin[2] |= 1 << 7 if w == 1
c4bin[2] |= 0xF << 3 if not argnr
addop(name, c4bin, *args) { |o|
o.args.insert(argnr, argt) if argnr
o.fields[:vex_r] = [1, 7] if @size == 64
o.fields[:vex_x] = [1, 6] if @size == 64
o.fields[:vex_b] = [1, 5]
o.fields[:vex_w] = [2, 7] if not w
o.fields[:vex_vvvv] = [2, 3] if argnr
yield o if block_given?
}
return if w == 1 or mmmmm != 1
c5bin = [0xC5, lpp, bin]
c5bin[1] |= 1 << 7 if @size != 64
c5bin[1] |= 0xF << 3 if not argnr
addop(name, c5bin, *args) { |o|
o.args.insert(argnr, argt) if argnr
o.fields[:vex_r] = [1, 7] if @size == 64
o.fields[:vex_vvvv] = [1, 3] if argnr
yield o if block_given?
}
end
# helper function: creates a new Opcode based on the arguments, optionally
# yields it for further customisation, and appends it to the instruction set
# also responsible for creating disambiguating opcodes when necessary (:s flag hardcoding)
def addop(name, bin, hint=nil, *argprops)
fields = (argprops.first.kind_of?(Hash) ? argprops.shift : {})
op = Opcode.new name, bin
op.fields.replace fields
case hint
when nil
when :mrm, :mrmw, :mrmA
op.fields[:reg] = [bin.length, 3]
op.fields[:modrm] = [bin.length, 0]
op.fields[:w] = [bin.length - 1, 0] if hint == :mrmw
argprops.unshift :reg, :modrm
argprops << :modrmA if hint == :mrmA
op.bin << 0
when :reg
op.fields[:reg] = [bin.length-1, 0]
argprops.unshift :reg
when :regfp
op.fields[:regfp] = [bin.length-1, 0]
argprops.unshift :regfp, :regfp0
when :modrmA
op.fields[:modrm] = [bin.length-1, 0]
argprops << :modrm << :modrmA
when Integer # mod/m, reg == opcode extension = hint
op.fields[:modrm] = [bin.length, 0]
op.bin << (hint << 3)
argprops.unshift :modrm
when :mrmmmx
op.fields[:regmmx] = [bin.length, 3]
op.fields[:modrm] = [bin.length, 0]
bin << 0
argprops.unshift :regmmx, :modrmmmx
when :mrmxmm
op.fields[:regxmm] = [bin.length, 3]
op.fields[:modrm] = [bin.length, 0]
bin << 0
argprops.unshift :regxmm, :modrmxmm
when :mrmymm
op.fields[:regymm] = [bin.length, 3]
op.fields[:modrm] = [bin.length, 0]
bin << 0
argprops.unshift :regymm, :modrmymm
else
raise SyntaxError, "invalid hint #{hint.inspect} for #{name}"
end
argprops.each { |a|
op.props[a] = true if @valid_props[a]
op.args << a if @valid_args[a]
}
yield op if block_given?
if $DEBUG
argprops -= @valid_props.keys + @valid_args.keys
raise "Invalid opcode definition: #{name}: unknown #{argprops.inspect}" unless argprops.empty?
argprops = (op.props.keys - @valid_props.keys) + (op.args - @valid_args.keys) + (op.fields.keys - @fields_mask.keys)
raise "Invalid opcode customisation: #{name}: #{argprops.inspect}" unless argprops.empty?
end
addop_post(op)
end
# this recursive method is in charge of Opcode duplication (eg to hardcode some flag)
def addop_post(op)
if df = op.fields.delete(:d)
# hardcode the bit
dop = op.dup
addop_post dop
op.bin[df[0]] |= 1 << df[1]
op.args.reverse!
addop_post op
return
elsif wf = op.fields.delete(:w)
# hardcode the bit
dop = op.dup
dop.props[:argsz] = 8
# 64-bit w=0 s=1 => UD
dop.fields.delete(:s) if @size == 64
addop_post dop
op.bin[wf[0]] |= 1 << wf[1]
addop_post op
return
elsif sf = op.fields.delete(:s)
# add explicit choice versions, with lower precedence (so that disassembling will return the general version)
# eg "jmp", "jmp.i8", "jmp.i"
# also hardcode the bit
op32 = op
addop_post op32
op8 = op.dup
op8.bin[sf[0]] |= 1 << sf[1]
op8.args.map! { |arg| arg == :i ? :i8 : arg }
addop_post op8
op32 = op32.dup
op32.name << '.i'
addop_post op32
op8 = op8.dup
op8.name << '.i8'
addop_post op8
return
elsif op.args.first == :regfp0
dop = op.dup
dop.args.delete :regfp0
addop_post dop
end
if op.props[:needpfx]
@opcode_list.unshift op
else
@opcode_list << op
end
if (op.args == [:i] or op.args == [:farptr] or op.name == 'ret') and op.name !~ /\.i/
# define opsz-override version for ambiguous opcodes
op16 = op.dup
op16.name << '.i16'
op16.props[:opsz] = 16
@opcode_list << op16
op32 = op.dup
op32.name << '.i32'
op32.props[:opsz] = 32
@opcode_list << op32
elsif op.props[:strop] or op.props[:stropz] or op.args.include? :mrm_imm or
op.args.include? :modrm or op.name =~ /loop|xlat/
# define adsz-override version for ambiguous opcodes (TODO allow movsd edi / movsd di syntax)
# XXX loop pfx 67 = eip+cx, 66 = ip+ecx
op16 = op.dup
op16.name << '.a16'
op16.props[:adsz] = 16
@opcode_list << op16
op32 = op.dup
op32.name << '.a32'
op32.props[:adsz] = 32
@opcode_list << op32
end
end
end
end
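The VEX field arithmetic used by addop_vex above can be checked on its own. This is a minimal sketch under stated assumptions: vex_fields is a hypothetical helper, not a Metasm API, and it only mirrors the lpp and mmmmm computations (vector length bit, mandatory-prefix index, opcode-map index) without building a full opcode.

```ruby
# Hypothetical helper mirroring the lpp/mmmmm arithmetic of addop_vex.
# l   : vector length in bits (128 or 256)
# pfx : mandatory prefix (nil, 0x66, 0xF3 or 0xF2)
# of  : opcode map (0x0F, 0x0F38 or 0x0F3A)
def vex_fields(l, pfx, of)
  pp_bits = [nil, 0x66, 0xF3, 0xF2].index(pfx)
  map_sel = [nil, 0x0F, 0x0F38, 0x0F3A].index(of)
  raise ArgumentError, 'unknown prefix or opcode map' if pp_bits.nil? || map_sel.nil?
  # L is bit 2 (set for 256-bit vectors), pp is the low two bits
  { :lpp => ((l >> 8) << 2) | pp_bits, :mmmmm => map_sel }
end

f = vex_fields(256, 0x66, 0x0F38)
puts format('lpp=%03b mmmmm=%05b', f[:lpp], f[:mmmmm])   # prints "lpp=101 mmmmm=00010"
```

Since mmmmm is 2 for the 0x0F38 map, such an opcode only gets the three-byte C4 form; the two-byte C5 form is emitted only when mmmmm is 1 and w is not 1.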
-14
View File
@@ -1,14 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
class Metasm::MIPS < Metasm::CPU
end
require 'metasm/main'
require 'metasm/cpu/mips/parse'
require 'metasm/cpu/mips/encode'
require 'metasm/cpu/mips/decode'
require 'metasm/cpu/mips/render'
require 'metasm/cpu/mips/debug'
-42
View File
@@ -1,42 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
module Metasm
class MIPS
def dbg_register_pc
@dbg_register_pc ||= :pc
end
def dbg_register_flags
@dbg_register_flags ||= :flags
end
def dbg_register_list
@dbg_register_list ||= %w[z0 at v0 v1 a0 a1 a2 a3
t0 t1 t2 t3 t4 t5 t6 t7
s0 s1 s2 s3 s4 s5 s6 s7
t8 t9 k0 k1 gp sp fp ra
sr mullo mulhi badva cause pc].map { |r| r.to_sym }
end
def dbg_flag_list
@dbg_flag_list ||= []
end
def dbg_register_size
@dbg_register_size ||= Hash.new(@size)
end
def dbg_need_stepover(dbg, addr, di)
di and di.opcode.props[:saveip]
end
def dbg_end_stepout(dbg, addr, di)
di and di.opcode.name == 'foobar' # TODO
end
end
end
-55
View File
@@ -1,55 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/ppc/opcodes'
require 'metasm/parse'
module Metasm
class PowerPC
# TODO
def parse_arg_valid?(op, sym, arg)
case sym
when :ra, :rb, :rs, :rt; arg.kind_of?(GPR)
when :fra, :frb, :frc, :frs, :frt; arg.kind_of?(FPR)
when :ra_i16, :ra_i16s, :ra_i16q; arg.kind_of?(Memref)
when :bd, :d, :ds, :dq, :si, :ui, :li, :sh, :mb, :me, :mb_, :me_, :u; arg.kind_of?(Expression)
when :ba, :bf, :bfa, :bt; arg.kind_of?(CR)
when :ign_bo_zzz, :ign_bo_z, :ign_bo_at, :ign_bo_at2, :aa, :lk, :oe, :rc, :l; # ?
when :bb, :bh, :flm, :fxm, :l_, :l__, :lev, :nb, :sh_, :spr, :sr, :tbr, :th, :to
# TODO
else raise "internal error: ppc arg #{sym.inspect}"
end
end
def parse_argument(pgm)
pgm.skip_space
return if not tok = pgm.readtok
if tok.type == :string
return GPR.new(GPR.s_to_i[tok.raw]) if GPR.s_to_i[tok.raw]
return SPR.new(SPR.s_to_i[tok.raw]) if SPR.s_to_i[tok.raw]
return FPR.new(FPR.s_to_i[tok.raw]) if FPR.s_to_i[tok.raw]
return CR.new(CR.s_to_i[tok.raw]) if CR.s_to_i[tok.raw]
return MSR.new if tok.raw == 'msr'
end
pgm.unreadtok tok
arg = Expression.parse pgm
pgm.skip_space
# check memory indirection: 'off(base reg)' # XXX scaled index ?
if arg and pgm.nexttok and pgm.nexttok.type == :punct and pgm.nexttok.raw == '('
pgm.readtok
pgm.skip_space_eol
ntok = pgm.readtok
raise tok, "Invalid base #{ntok}" unless ntok and ntok.type == :string and GPR.s_to_i[ntok.raw]
base = GPR.new GPR.s_to_i[ntok.raw]
pgm.skip_space_eol
ntok = pgm.readtok
raise tok, "Invalid memory reference, ')' expected" if not ntok or ntok.type != :punct or ntok.raw != ')'
arg = Memref.new base, arg
end
arg
end
end
end
-8
View File
@@ -1,8 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2010 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
require 'metasm/cpu/python/decode'
-136
View File
@@ -1,136 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2010 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/python/opcodes'
require 'metasm/decode'
module Metasm
class Python
def build_bin_lookaside
opcode_list.inject({}) { |la, op| la.update op.bin => op }
end
def decode_findopcode(edata)
di = DecodedInstruction.new(self)
byte = edata.decode_imm(:u8, :little)
di if di.opcode = @bin_lookaside[byte]
end
def decode_instr_op(edata, di)
di.bin_length = 1
di.instruction.opname = di.opcode.name
di.opcode.args.each { |a|
case a
when :cmp
di.bin_length += 2
v = edata.decode_imm(:i16, @endianness)
di.instruction.args << (CMP_OP[v] || Expression[v])
when :i16
di.bin_length += 2
di.instruction.args << Expression[edata.decode_imm(:i16, @endianness)]
when :u8
di.bin_length += 1
di.instruction.args << Expression[edata.decode_imm(:u8, @endianness)]
else
raise "unsupported arg #{a.inspect}"
end
}
return if edata.ptr > edata.length
di
end
def decode_instr_interpret(di, addr)
case di.opcode.name
when 'LOAD_CONST'
if c = prog_code(addr)
cst = c[:consts][di.instruction.args.first.reduce]
if cst.kind_of? Hash and cst[:type] == :code
di.add_comment "lambda #{Expression[cst[:fileoff]]}"
else
di.add_comment cst.inspect
end
end
when 'LOAD_NAME', 'LOAD_ATTR', 'LOAD_GLOBAL', 'STORE_NAME', 'IMPORT_NAME', 'LOAD_FAST'
if c = prog_code(addr)
di.add_comment c[:names][di.instruction.args.first.reduce].inspect
end
end
di
end
def backtrace_binding
@backtrace_binding ||= init_backtrace_binding
end
def init_backtrace_binding
@backtrace_binding ||= {}
opcode_list.each { |op|
binding = case op
when 'nop'; lambda { |*a| {} }
end
@backtrace_binding[op] ||= binding if binding
}
@backtrace_binding
end
def get_backtrace_binding(di)
a = di.instruction.args.map { |arg|
case arg
when Var; arg.symbolic
else arg
end
}
if binding = backtrace_binding[di.opcode.basename]
binding[di, *a]
else
puts "unhandled instruction to backtrace: #{di}" if $VERBOSE
{ :incomplete_binding => Expression[1] }
end
end
def get_xrefs_x(dasm, di)
return [] if not di.opcode.props[:setip]
arg = case di.opcode.name
when 'JUMP_FORWARD', 'FOR_ITER'
# relative offset
di.instruction.args.last.reduce + di.next_addr
when 'CALL_FUNCTION_VAR'
'lol'
when /CALL/
:unknown
else
# absolute offset from :code start
off = di.instruction.args.last.reduce
if c = prog_code(di)
off += c[:fileoff]
end
off
end
[Expression[(arg.kind_of?(Var) ? arg.symbolic : arg)]]
end
def prog_code(addr)
addr = addr.address if addr.kind_of? DecodedInstruction
@last_prog_code ||= nil
return @last_prog_code if @last_prog_code and @last_prog_code[:fileoff] <= addr and @last_prog_code[:fileoff] + @last_prog_code[:code].length > addr
@last_prog_code = @program.code_at_off(addr) if @program
end
def backtrace_is_function_return(expr, di=nil)
#Expression[expr].reduce == Expression['wtf']
end
end
end
-36
View File
@@ -1,36 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2010 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
module Metasm
class Python < CPU
def initialize(prog = nil)
super()
@program = prog
@endianness = (prog.respond_to?(:endianness) ? prog.endianness : :little)
@size = (prog.respond_to?(:size) ? prog.size : 32)
end
class Var
include Renderable
attr_accessor :i
def initialize(i); @i = i end
def ==(o)
o.class == self.class and o.i == i
end
def symbolic; "var_#{@i}".to_sym end
def render
["var_#@i"]
end
end
end
end
-180
View File
@@ -1,180 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2010 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/python/main'
module Metasm
class Python
CMP_OP = %w[< <= == != > >= in not_in is is_not exch]
def addop(name, bin, *args)
o = Opcode.new(name)
o.bin = bin
args.each { |a|
o.args << a if @valid_args[a]
o.props[a] = true if @valid_props[a]
}
o.args << :i16 if o.bin >= 90 and o.props.empty? # HAVE_ARGUMENT
@opcode_list << o
end
def init_opcode_list
@opcode_list = []
@valid_args[:u8] = true
@valid_args[:i16] = true
@valid_args[:cmp] = true
addop 'STOP_CODE', 0, :stopexec
addop 'POP_TOP', 1
addop 'ROT_TWO', 2
addop 'ROT_THREE', 3
addop 'DUP_TOP', 4
addop 'ROT_FOUR', 5
addop 'NOP', 9
addop 'UNARY_POSITIVE', 10
addop 'UNARY_NEGATIVE', 11
addop 'UNARY_NOT', 12
addop 'UNARY_CONVERT', 13
addop 'UNARY_INVERT', 15
addop 'BINARY_POWER', 19
addop 'BINARY_MULTIPLY', 20
addop 'BINARY_DIVIDE', 21
addop 'BINARY_MODULO', 22
addop 'BINARY_ADD', 23
addop 'BINARY_SUBTRACT', 24
addop 'BINARY_SUBSCR', 25
addop 'BINARY_FLOOR_DIVIDE', 26
addop 'BINARY_TRUE_DIVIDE', 27
addop 'INPLACE_FLOOR_DIVIDE', 28
addop 'INPLACE_TRUE_DIVIDE', 29
addop 'SLICE', 30
addop 'SLICE_1', 31
addop 'SLICE_2', 32
addop 'SLICE_3', 33
addop 'STORE_SLICE', 40
addop 'STORE_SLICE_1', 41
addop 'STORE_SLICE_2', 42
addop 'STORE_SLICE_3', 43
addop 'DELETE_SLICE', 50
addop 'DELETE_SLICE_1', 51
addop 'DELETE_SLICE_2', 52
addop 'DELETE_SLICE_3', 53
addop 'STORE_MAP', 54
addop 'INPLACE_ADD', 55
addop 'INPLACE_SUBTRACT', 56
addop 'INPLACE_MULTIPLY', 57
addop 'INPLACE_DIVIDE', 58
addop 'INPLACE_MODULO', 59
addop 'STORE_SUBSCR', 60
addop 'DELETE_SUBSCR', 61
addop 'BINARY_LSHIFT', 62
addop 'BINARY_RSHIFT', 63
addop 'BINARY_AND', 64
addop 'BINARY_XOR', 65
addop 'BINARY_OR', 66
addop 'INPLACE_POWER', 67
addop 'GET_ITER', 68
addop 'PRINT_EXPR', 70
addop 'PRINT_ITEM', 71
addop 'PRINT_NEWLINE', 72
addop 'PRINT_ITEM_TO', 73
addop 'PRINT_NEWLINE_TO', 74
addop 'INPLACE_LSHIFT', 75
addop 'INPLACE_RSHIFT', 76
addop 'INPLACE_AND', 77
addop 'INPLACE_XOR', 78
addop 'INPLACE_OR', 79
addop 'BREAK_LOOP', 80
addop 'WITH_CLEANUP', 81
addop 'LOAD_LOCALS', 82
addop 'RETURN_VALUE', 83
addop 'IMPORT_STAR', 84
addop 'EXEC_STMT', 85
addop 'YIELD_VALUE', 86
addop 'POP_BLOCK', 87
addop 'END_FINALLY', 88
addop 'BUILD_CLASS', 89
#addop 'HAVE_ARGUMENT', 90 #/* Opcodes from here have an argument: */
addop 'STORE_NAME', 90 #/* Index in name list */
addop 'DELETE_NAME', 91 #/* "" */
addop 'UNPACK_SEQUENCE', 92 #/* Number of sequence items */
addop 'FOR_ITER', 93, :setip
addop 'LIST_APPEND', 94
addop 'STORE_ATTR', 95 #/* Index in name list */
addop 'DELETE_ATTR', 96 #/* "" */
addop 'STORE_GLOBAL', 97 #/* "" */
addop 'DELETE_GLOBAL', 98 #/* "" */
addop 'DUP_TOPX', 99 #/* number of items to duplicate */
addop 'LOAD_CONST', 100 #/* Index in const list */
addop 'LOAD_NAME', 101 #/* Index in name list */
addop 'BUILD_TUPLE', 102 #/* Number of tuple items */
addop 'BUILD_LIST', 103 #/* Number of list items */
addop 'BUILD_SET', 104 #/* Number of set items */
addop 'BUILD_MAP', 105 #/* Always zero for now */
addop 'LOAD_ATTR', 106 #/* Index in name list */
addop 'COMPARE_OP', 107, :cmp #/* Comparison operator */
addop 'IMPORT_NAME', 108 #/* Index in name list */
addop 'IMPORT_FROM', 109 #/* Index in name list */
addop 'JUMP_FORWARD', 110, :setip, :stopexec #/* Number of bytes to skip */
addop 'JUMP_IF_FALSE_OR_POP', 111, :setip #/* Target byte offset from beginning of code */
addop 'JUMP_IF_TRUE_OR_POP', 112, :setip #/* "" */
addop 'JUMP_ABSOLUTE', 113, :setip, :stopexec #/* "" */
addop 'POP_JUMP_IF_FALSE', 114, :setip #/* "" */
addop 'POP_JUMP_IF_TRUE', 115, :setip #/* "" */
addop 'LOAD_GLOBAL', 116 #/* Index in name list */
addop 'CONTINUE_LOOP', 119 #/* Start of loop (absolute) */
addop 'SETUP_LOOP', 120 #/* Target address (relative) */
addop 'SETUP_EXCEPT', 121 #/* "" */
addop 'SETUP_FINALLY', 122 #/* "" */
addop 'LOAD_FAST', 124 #/* Local variable number */
addop 'STORE_FAST', 125 #/* Local variable number */
addop 'DELETE_FAST', 126 #/* Local variable number */
addop 'RAISE_VARARGS', 130 #/* Number of raise arguments (1, 2 or 3) */
#/* CALL_FUNCTION_XXX opcodes defined below depend on this definition */
addop 'CALL_FUNCTION', 131, :u8, :u8, :setip #/* #args + (#kwargs<<8) */
addop 'MAKE_FUNCTION', 132 #/* #defaults */
addop 'BUILD_SLICE', 133 #/* Number of items */
addop 'MAKE_CLOSURE', 134 #/* #free vars */
addop 'LOAD_CLOSURE', 135 #/* Load free variable from closure */
addop 'LOAD_DEREF', 136 #/* Load and dereference from closure cell */
addop 'STORE_DEREF', 137 #/* Store into cell */
#/* The next 3 opcodes must be contiguous and satisfy (CALL_FUNCTION_VAR - CALL_FUNCTION) & 3 == 1 */
addop 'CALL_FUNCTION_VAR', 140, :u8, :u8, :setip #/* #args + (#kwargs<<8) */
addop 'CALL_FUNCTION_KW', 141, :u8, :u8, :setip #/* #args + (#kwargs<<8) */
addop 'CALL_FUNCTION_VAR_KW', 142, :u8, :u8, :setip #/* #args + (#kwargs<<8) */
addop 'SETUP_WITH', 143
#/* Support for opargs more than 16 bits long */
addop 'EXTENDED_ARG', 145
addop 'SET_ADD', 146
addop 'MAP_ADD', 147
end
end
end
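The opcode table above relies on the HAVE_ARGUMENT convention: opcodes numbered 90 and higher are followed by a two-byte argument. A minimal stand-alone sketch of that decoding rule follows. It assumes a little-endian argument read as an unsigned value (the Metasm decoder uses :i16, and treats the CALL_FUNCTION family as two :u8 arguments), and toy_disasm with its three-entry name table is hypothetical.

```ruby
# Hypothetical toy disassembler; only illustrates the HAVE_ARGUMENT rule.
def toy_disasm(code)
  have_argument = 90
  names = { 23 => 'BINARY_ADD', 83 => 'RETURN_VALUE', 100 => 'LOAD_CONST' }
  bytes = code.unpack('C*')
  out = []
  i = 0
  while i < bytes.length
    op = bytes[i]
    name = names[op] || "OP_#{op}"
    if op >= have_argument
      break if i + 2 >= bytes.length          # truncated argument, stop decoding
      arg = bytes[i + 1] | (bytes[i + 2] << 8) # 16-bit little-endian argument
      out << "#{name} #{arg}"
      i += 3
    else
      out << name
      i += 1
    end
  end
  out
end

p toy_disasm([100, 0, 0, 100, 1, 0, 23, 83].pack('C*'))
# => ["LOAD_CONST 0", "LOAD_CONST 1", "BINARY_ADD", "RETURN_VALUE"]
```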
-15
View File
@@ -1,15 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
class Metasm::X86_64 < Metasm::Ia32
end
require 'metasm/main'
require 'metasm/cpu/x86_64/parse'
require 'metasm/cpu/x86_64/encode'
require 'metasm/cpu/x86_64/decode'
require 'metasm/cpu/x86_64/render'
require 'metasm/cpu/x86_64/debug'
require 'metasm/cpu/x86_64/compile_c'
-136
View File
@@ -1,136 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/x86_64/main'
require 'metasm/cpu/ia32/opcodes'
module Metasm
class X86_64
def init_cpu_constants
super()
[:i32, :u32, :i64, :u64].each { |a| @valid_args[a] = true }
end
def init_386_common_only
super()
# :imm64 => accept a real int64 as :i argument
# :auto64 => ignore rex_w, always 64-bit op
# :op32no64 => if writing to a 32-bit reg, don't zero the top 32 bits of dest
[:imm64, :auto64, :op32no64].each { |a| @valid_props[a] = true }
@opcode_list.delete_if { |o| o.bin[0].to_i & 0xf0 == 0x40 } # now REX prefix
@opcode_list.each { |o|
o.props[:imm64] = true if o.bin == [0xB8] # mov reg, <true imm64>
o.props[:auto64] = true if o.name =~ /^(j.*|loop.*|call|enter|leave|push|pop|ret)$/
}
addop 'movsxd', [0x63], :mrm
addop('cdqe', [0x98]) { |o| o.props[:opsz] = 64 }
addop('cqo', [0x99]) { |o| o.props[:opsz] = 64 }
end
# all x86_64 CPUs support every instruction set up to and including SSE2
def init_x8664_only
init_386_common_only
init_386_only
init_387_only
init_486_only
init_pentium_only
init_p6_only
init_sse_only
init_sse2_only
@opcode_list.delete_if { |o|
o.args.include?(:seg2) or
o.args.include?(:seg2A) or
o.args.include?(:farptr) or
%w[aaa aad aam aas bound daa das into jcxz jecxz
lds les loadall arpl pusha pushad popa
popad].include?(o.name.split('.')[0])
# split needed for lds.a32
}
@opcode_list.each { |o|
o.props[:auto64] = true if o.name =~ /^(enter|leave|[sl]gdt|[sl]idt|[sl]ldt|[sl]tr|push|pop|syscall)$/
}
addop('cmpxchg16b', [0x0F, 0xC7], 1) { |o| o.props[:opsz] = 64 ; o.props[:argsz] = 128 }
addop('iretq', [0xCF], nil, :stopexec, :setip) { |o| o.props[:opsz] = 64 } ; opcode_list.unshift opcode_list.pop
addop 'swapgs', [0x0F, 0x01, 0xF8]
addop('movq', [0x0F, 0x6E], :mrmmmx, {:d => [1, 4]}) { |o| o.args = [:modrm, :regmmx] ; o.props[:opsz] = o.props[:argsz] = 64 }
addop('movq', [0x0F, 0x6E], :mrmxmm, {:d => [1, 4]}) { |o| o.args = [:modrm, :regxmm] ; o.props[:opsz] = o.props[:argsz] = 64 ; o.props[:needpfx] = 0x66 }
addop('jcxz', [0xE3], nil, :setip, :i8) { |o| o.props[:adsz] = 32 } # actually 16 (cx), but x64 in general says pfx 0x67 => adsz = 32
addop('jrcxz', [0xE3], nil, :setip, :i8) { |o| o.props[:adsz] = 64 }
end
def init_sse3
init_x8664_only
init_sse3_only
end
def init_sse41_only
super()
addop('pextrq', [0x0F, 0x3A, 0x16], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66; o.args[o.args.index(:modrmxmm)] = :modrm; o.props[:opsz] = o.props[:argsz] = 64 }
addop('pinsrq', [0x0F, 0x3A, 0x22], :mrmxmm, :u8) { |o| o.props[:needpfx] = 0x66; o.args[o.args.index(:modrmxmm)] = :modrm; o.props[:opsz] = o.props[:argsz] = 64 }
end
def init_avx_only
super()
addop('rdfsbase', [0x0F, 0xAE], 0, :modrmR) { |o| o.props[:needpfx] = 0xF3 }
addop('rdgsbase', [0x0F, 0xAE], 1, :modrmR) { |o| o.props[:needpfx] = 0xF3 }
addop('wrfsbase', [0x0F, 0xAE], 2, :modrmR) { |o| o.props[:needpfx] = 0xF3 }
addop('wrgsbase', [0x0F, 0xAE], 3, :modrmR) { |o| o.props[:needpfx] = 0xF3 }
end
def addop_macrostr(name, bin, type)
super(name, bin, type)
bin = bin.dup
bin[0] |= 1
addop(name+'q', bin) { |o| o.props[:opsz] = 64 ; o.props[type] = true }
end
def addop_macroret(name, bin, *args)
addop(name + '.i64', bin, nil, :stopexec, :setip, *args) { |o| o.props[:opsz] = 64 }
super(name, bin, *args)
end
def addop_post(op)
if op.fields[:d] or op.fields[:w] or op.fields[:s] or op.args.first == :regfp0
return super(op)
end
if op.props[:needpfx]
@opcode_list.unshift op
else
@opcode_list << op
end
if op.args == [:i] or op.name == 'ret'
# define opsz-override version for ambiguous opcodes
op16 = op.dup
op16.name << '.i16'
op16.props[:opsz] = 16
@opcode_list << op16
# push/call/ret/jz have no 32-bit operand-size form in 64-bit mode
op64 = op.dup
op64.name << '.i64'
op64.props[:opsz] = 64
@opcode_list << op64
elsif op.props[:strop] or op.props[:stropz] or op.args.include? :mrm_imm or
op.args.include? :modrm or op.name =~ /loop|xlat/
# define adsz-override version for ambiguous opcodes (movsq)
# XXX loop pfx 67 = rip+ecx, 66/rex ignored
op32 = op.dup
op32.name << '.a32'
op32.props[:adsz] = 32
@opcode_list << op32
op64 = op.dup
op64.name << '.a64'
op64.props[:adsz] = 64
@opcode_list << op64
end
end
end
end
-35
View File
@@ -1,35 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/x86_64/opcodes'
require 'metasm/render'
module Metasm
class X86_64
def gui_hilight_word_regexp_init
ret = {}
%w[a b c d].each { |r|
ret["#{r}l"] = "[re]?#{r}x|#{r}l"
ret["#{r}h"] = "[re]?#{r}x|#{r}h"
ret["#{r}x"] = ret["e#{r}x"] = ret["r#{r}x"] = "[re]?#{r}x|#{r}[hl]"
}
%w[sp bp si di].each { |r|
ret["#{r}l"] = ret[r] = ret["e#{r}"] = ret["r#{r}"] = "[re]?#{r}|#{r}l"
}
(8..15).each { |i|
r = "r#{i}"
ret[r+'b'] = ret[r+'w'] = ret[r+'d'] = ret[r] = "#{r}[bwd]?"
}
ret['eip'] = ret['rip'] = '[re]ip'
ret
end
end
end
-313
View File
@@ -1,313 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/z80/opcodes'
require 'metasm/decode'
module Metasm
class Z80
def build_opcode_bin_mask(op)
# bit = 0 if it can be mutated by a field value, 1 if fixed by the opcode
op.bin_mask = Array.new(op.bin.length, 0)
op.fields.each { |f, (oct, off)|
op.bin_mask[oct] |= (@fields_mask[f] << off)
}
op.bin_mask.map! { |v| 255 ^ v }
end
def build_bin_lookaside
# sets up a hash byte value => list of opcodes that may match
# opcode.bin_mask is built here
lookaside = Array.new(256) { [] }
opcode_list.each { |op|
build_opcode_bin_mask op
b = op.bin[0]
msk = op.bin_mask[0]
next @unknown_opcode = op if not b
for i in b..(b | (255^msk))
lookaside[i] << op if i & msk == b & msk
end
}
lookaside
end
def decode_prefix(instr, byte)
case byte
when 0xDD; instr.prefix = 0xDD
when 0xFD; instr.prefix = 0xFD
# implicit 'else return false'
end
end
# tries to find the opcode encoded at edata.ptr
# if no match, tries to match a prefix (update di.instruction.prefix)
# on match, edata.ptr points to the first byte of the opcode (after prefixes)
def decode_findopcode(edata)
di = DecodedInstruction.new self
while edata.ptr < edata.data.length
byte = edata.data[edata.ptr]
byte = byte.unpack('C').first if byte.kind_of?(::String)
return di if di.opcode = @bin_lookaside[byte].find { |op|
# fetch the relevant bytes from edata
bseq = edata.data[edata.ptr, op.bin.length].unpack('C*')
# check against full opcode mask
op.bin.zip(bseq, op.bin_mask).all? { |b1, b2, m| b2 and ((b1 & m) == (b2 & m)) }
}
if decode_prefix(di.instruction, edata.get_byte)
nb = edata.data[edata.ptr]
nb = nb.unpack('C').first if nb.kind_of?(::String)
case nb
when 0xCB
# DD CB <disp8> <opcode_pfxCB> [<args>]
di.instruction.prefix |= edata.get_byte << 8
di.bin_length += 2
opc = edata.data[edata.ptr+1]
opc = opc.unpack('C').first if opc.kind_of?(::String)
bseq = [0xCB, opc]
# XXX in decode_instr_op, byte[0] is the immediate displacement instead of cb
return di if di.opcode = @bin_lookaside[nb].find { |op|
op.bin.zip(bseq, op.bin_mask).all? { |b1, b2, m| b2 and ((b1 & m) == (b2 & m)) }
}
when 0xED
di.instruction.prefix = nil
end
else
di.opcode = @unknown_opcode
return di
end
di.bin_length += 1
end
end
def decode_instr_op(edata, di)
before_ptr = edata.ptr
op = di.opcode
di.instruction.opname = op.name
bseq = edata.read(op.bin.length).unpack('C*') # decode_findopcode ensures that data >= op.length
pfx = di.instruction.prefix
field_val = lambda { |f|
if fld = op.fields[f]
(bseq[fld[0]] >> fld[1]) & @fields_mask[f]
end
}
op.args.each { |a|
di.instruction.args << case a
when :i8, :u8, :i16, :u16; Expression[edata.decode_imm(a, @endianness)]
when :iy; Expression[field_val[a]]
when :iy8; Expression[field_val[a]*8]
when :rp
v = field_val[a]
Reg.new(16, v)
when :rp2
v = field_val[a]
v = 4 if v == 3
Reg.new(16, v)
when :ry, :rz
v = field_val[a]
if v == 6
Memref.new(Reg.from_str('HL'), nil, 1)
else
Reg.new(8, v)
end
when :r_a; Reg.from_str('A')
when :r_af; Reg.from_str('AF')
when :r_hl; Reg.from_str('HL')
when :r_de; Reg.from_str('DE')
when :r_sp; Reg.from_str('SP')
when :r_i; Reg.from_str('I')
when :m16; Memref.new(nil, edata.decode_imm(:u16, @endianness), nil)
when :m_bc; Memref.new(Reg.from_str('BC'), nil, 1)
when :m_de; Memref.new(Reg.from_str('DE'), nil, 1)
when :m_sp; Memref.new(Reg.from_str('SP'), nil, 2)
when :m_hl; Memref.new(Reg.from_str('HL'), nil, 1)
when :mf8; Memref.new(nil, 0xff00 + edata.decode_imm(:u8, @endianness), 1)
when :mfc; Memref.new(Reg.from_str('C'), 0xff00, 1)
else raise SyntaxError, "Internal error: invalid argument #{a} in #{op.name}"
end
}
case pfx
when 0xDD
when 0xFD
when 0xCBDD
when 0xCBFD
end
di.bin_length += edata.ptr - before_ptr
return if edata.ptr > edata.length
di
end
# hash opcode_name => lambda { |dasm, di, *symbolic_args| instr_binding }
def backtrace_binding
@backtrace_binding ||= init_backtrace_binding
end
def backtrace_binding=(b) @backtrace_binding = b end
# populate the @backtrace_binding hash with default values
def init_backtrace_binding
@backtrace_binding ||= {}
mask = 0xffff
opcode_list.map { |ol| ol.basename }.uniq.sort.each { |op|
binding = case op
when 'ld'; lambda { |di, a0, a1, *aa| a2 = aa[0] ; a2 ? { a0 => Expression[a1, :+, a2] } : { a0 => Expression[a1] } }
when 'ldi'; lambda { |di, a0, a1| hl = (a0 == :a ? a1 : a0) ; { a0 => Expression[a1], hl => Expression[hl, :+, 1] } }
when 'ldd'; lambda { |di, a0, a1| hl = (a0 == :a ? a1 : a0) ; { a0 => Expression[a1], hl => Expression[hl, :-, 1] } }
when 'add', 'adc', 'sub', 'sbc', 'and', 'xor', 'or'
lambda { |di, a0, a1|
e_op = { 'add' => :+, 'adc' => :+, 'sub' => :-, 'sbc' => :-, 'and' => :&, 'xor' => :^, 'or' => :| }[op]
ret = Expression[a0, e_op, a1]
ret = Expression[ret, e_op, :flag_c] if op == 'adc' or op == 'sbc'
ret = Expression[ret.reduce] if not a0.kind_of? Indirection
{ a0 => ret }
}
when 'cp', 'cmp'; lambda { |di, *a| {} }
when 'inc'; lambda { |di, a0| { a0 => Expression[a0, :+, 1] } }
when 'dec'; lambda { |di, a0| { a0 => Expression[a0, :-, 1] } }
when 'not'; lambda { |di, a0| { a0 => Expression[a0, :^, mask] } }
when 'push'
lambda { |di, a0| { :sp => Expression[:sp, :-, 2],
Indirection[:sp, 2, di.address] => Expression[a0] } }
when 'pop'
lambda { |di, a0| { :sp => Expression[:sp, :+, 2],
a0 => Indirection[:sp, 2, di.address] } }
when 'call'
lambda { |di, a0| { :sp => Expression[:sp, :-, 2],
Indirection[:sp, 2, di.address] => Expression[di.next_addr] }
}
when 'ret', 'reti'; lambda { |di, *a| { :sp => Expression[:sp, :+, 2] } }
# TODO callCC, retCC ...
when 'bswap'
lambda { |di, a0| { a0 => Expression[
[[a0, :&, 0xff00], :>>, 8], :|,
[[a0, :&, 0x00ff], :<<, 8]] } }
when 'nop', /^j/; lambda { |di, *a| {} }
end
# TODO flags ?
@backtrace_binding[op] ||= binding if binding
}
@backtrace_binding
end
def get_backtrace_binding(di)
a = di.instruction.args.map { |arg|
case arg
when Memref, Reg; arg.symbolic(di)
else arg
end
}
if binding = backtrace_binding[di.opcode.basename]
binding[di, *a]
else
puts "unhandled instruction to backtrace: #{di}" if $VERBOSE
# assume nothing except the 1st arg is modified
case a[0]
when Indirection, Symbol; { a[0] => Expression::Unknown }
when Expression; (x = a[0].externals.first) ? { x => Expression::Unknown } : {}
else {}
end.update(:incomplete_binding => Expression[1])
end
end
# patch a forward binding from the backtrace binding
def fix_fwdemu_binding(di, fbd)
case di.opcode.name
when 'push', 'call'; fbd[Indirection[[:sp, :-, 2], 2]] = fbd.delete(Indirection[:sp, 2])
end
fbd
end
def get_xrefs_x(dasm, di)
return [] if not di.opcode.props[:setip]
case di.opcode.basename
when 'ret', 'reti'
return [Indirection[:sp, 2, di.address]]
when /^jr|^djnz/
# jmp/call are absolute addrs, only jr/djnz are relative
# also, the asm source should display the relative offset
return [Expression[[di.address, :+, di.bin_length], :+, di.instruction.args.first]]
end
case tg = di.instruction.args.first
when Memref; [Expression[tg.symbolic(di)]]
when Reg; [Expression[tg.symbolic(di)]]
when Expression, ::Integer; [Expression[tg]]
else
puts "unhandled setip at #{di.address} #{di.instruction}" if $DEBUG
[]
end
end
# checks if expr is a valid return expression matching the :saveip instruction
def backtrace_is_function_return(expr, di=nil)
expr = Expression[expr].reduce_rec
expr.kind_of?(Indirection) and expr.len == 2 and expr.target == Expression[:sp]
end
# updates the function backtrace_binding
# if the function is big and no specific register is given, do nothing (the binding will be lazily updated later, on demand)
def backtrace_update_function_binding(dasm, faddr, f, retaddrlist, *wantregs)
b = f.backtrace_binding
bt_val = lambda { |r|
next if not retaddrlist
b[r] = Expression::Unknown
bt = []
retaddrlist.each { |retaddr|
bt |= dasm.backtrace(Expression[r], retaddr, :include_start => true,
:snapshot_addr => faddr, :origin => retaddr)
}
if bt.length != 1
b[r] = Expression::Unknown
else
b[r] = bt.first
end
}
if not wantregs.empty?
wantregs.each(&bt_val)
else
bt_val[:sp]
end
b
end
# returns true if the expression is an address on the stack
def backtrace_is_stack_address(expr)
Expression[expr].expr_externals.include?(:sp)
end
# updates an instruction's argument replacing an expression with another (eg label renamed)
def replace_instr_arg_immediate(i, old, new)
i.args.map! { |a|
case a
when Expression; a == old ? new : Expression[a.bind(old => new).reduce]
when Memref
a.offset = (a.offset == old ? new : Expression[a.offset.bind(old => new).reduce]) if a.offset
a
else a
end
}
end
end
end
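The mask and lookaside logic of `build_opcode_bin_mask` and `build_bin_lookaside` above can be tried in isolation. The following is a minimal sketch, not Metasm's API: `Op`, `FIELDS_MASK`, `bin_mask` and `build_lookaside` are hypothetical stand-ins introduced for illustration, and only the `ry`/`rz` field layout (mask 7, shifts 3 and 0) is taken from the Z80 opcode table in this diff.

```ruby
# Standalone sketch: Op is a hypothetical stand-in for Metasm::Opcode.
Op = Struct.new(:name, :bin, :fields)

# field name => bit mask (values taken from the Z80 opcode table)
FIELDS_MASK = { :ry => 7, :rz => 7 }

# bit = 1 when fixed by the opcode, 0 when it belongs to a field
def bin_mask(op)
  mask = Array.new(op.bin.length, 0)
  op.fields.each { |f, (oct, off)| mask[oct] |= FIELDS_MASK[f] << off }
  mask.map { |v| 255 ^ v }
end

# first byte value => list of opcodes that may start with that byte
def build_lookaside(ops)
  lookaside = Array.new(256) { [] }
  ops.each { |op|
    b = op.bin[0]
    msk = bin_mask(op)[0]
    # only walk the range the variable bits can reach
    (b..(b | (255 ^ msk))).each { |i|
      lookaside[i] << op if i & msk == b & msk
    }
  }
  lookaside
end

# 'ld ry, rz' has two 3-bit register fields in its single byte
LD  = Op.new('ld',  [0b01_000_000], { :ry => [0, 3], :rz => [0, 0] })
NOP = Op.new('nop', [0b00_000_000], {})
LOOKASIDE = build_lookaside([LD, NOP])
```

With these two opcodes, every first byte from 0x40 to 0x7F maps to `ld`, while `nop` is registered only under 0x00; the decoder then checks the full mask on the remaining bytes.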
-67
View File
@@ -1,67 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
module Metasm
class Z80 < CPU
class Reg
class << self
attr_accessor :s_to_i, :i_to_s
end
@i_to_s = { 8 => { 0 => 'B', 1 => 'C', 2 => 'D', 3 => 'E',
4 => 'H', 5 => 'L', 7 => 'A' },
16 => { 0 => 'BC', 1 => 'DE', 2 => 'HL', 3 => 'SP',
4 => 'AF' } } # AF is 3 too
@s_to_i = @i_to_s.inject({}) { |h, (sz, rh)|
h.update rh.inject({}) { |hh, (i, n)|
hh.update n => [sz, i] } }
attr_accessor :sz, :i
def initialize(sz, i)
@sz = sz
@i = i
end
def symbolic(orig=nil) ; to_s.to_sym ; end
def self.from_str(s)
raise "Bad name #{s.inspect}" if not x = @s_to_i[s]
new(*x)
end
end
class Memref
attr_accessor :base, :offset, :sz
def initialize(base, offset, sz=nil)
@base = base
offset = Expression[offset] if offset
@offset = offset
@sz = sz
end
def symbolic(orig)
p = nil
p = Expression[p, :+, @base.symbolic] if base
p = Expression[p, :+, @offset] if offset
Indirection[p.reduce, @sz, orig]
end
end
def initialize(family = :latest)
super()
@endianness = :little
@size = 16
@family = family
end
def init_opcode_list
send("init_#@family")
@opcode_list
end
end
end
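The nested `inject` that derives `@s_to_i` from `@i_to_s` in `Reg` above is compact enough to be worth seeing on its own. This is a sketch under stated assumptions: the register tables are copied from the class, while the constants `I_TO_S`/`S_TO_I` and the free function `reg_from_str` are hypothetical names used only here (the real `Reg.from_str` returns a `Reg` instance, not an array).

```ruby
# register tables copied from the Reg class above
I_TO_S = { 8  => { 0 => 'B', 1 => 'C', 2 => 'D', 3 => 'E',
                   4 => 'H', 5 => 'L', 7 => 'A' },
           16 => { 0 => 'BC', 1 => 'DE', 2 => 'HL', 3 => 'SP',
                   4 => 'AF' } }

# invert { size => { index => name } } into { name => [size, index] }
S_TO_I = I_TO_S.inject({}) { |h, (sz, rh)|
  h.update rh.inject({}) { |hh, (i, n)|
    hh.update n => [sz, i] } }

# hypothetical free-function variant of Reg.from_str:
# returns the [size, index] pair instead of building a Reg
def reg_from_str(s)
  raise "Bad name #{s.inspect}" if not x = S_TO_I[s]
  x
end
```

Index 6 is intentionally absent from the 8-bit table: the decoder in this diff treats that value as the memory operand `(HL)` rather than a register.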
-224
View File
@@ -1,224 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/z80/main'
module Metasm
class Z80
def addop(name, bin, *args)
o = Opcode.new name, bin
args.each { |a|
o.args << a if @fields_mask[a] or @valid_args[a]
o.props[a] = true if @valid_props[a]
o.fields[a] = [bin.length-1, @fields_shift[a]] if @fields_mask[a]
raise "wtf #{a.inspect}" unless @valid_args[a] or @valid_props[a] or @fields_mask[a]
}
@opcode_list << o
end
def addop_macrocc(name, bin, *args)
%w[nz z nc c po pe p m].each_with_index { |cc, i|
dbin = bin.dup
dbin[0] |= i << 3
addop name + cc, dbin, *args
}
end
# data from http://www.z80.info/decoding.htm
def init_z80_common
@opcode_list = []
@valid_args.update [:i8, :u8, :i16, :u16, :m16,
:r_a, :r_af, :r_hl, :r_de, :r_sp, :r_i,
:m_bc, :m_de, :m_sp, :m_hl, :mf8, :mfc
].inject({}) { |h, v| h.update v => true }
@fields_mask.update :rz => 7, :ry => 7, :rp => 3, :rp2 => 3, :iy => 7, :iy8 => 7
@fields_shift.update :rz => 0, :ry => 3, :rp => 4, :rp2 => 4, :iy => 3, :iy8 => 3
# some opcodes are in init_z80 when they are not part of the GB ABI
addop 'nop', [0b00_000_000]
addop 'jr', [0b00_011_000], :setip, :stopexec, :i8
%w[nz z nc c].each_with_index { |cc, i|
addop 'jr' + cc, [0b00_100_000 | (i << 3)], :setip, :i8
}
addop 'ld', [0b00_000_001], :rp, :i16
addop 'add', [0b00_001_001], :r_hl, :rp
addop 'ld', [0b00_000_010], :m_bc, :r_a
addop 'ld', [0b00_001_010], :r_a, :m_bc
addop 'ld', [0b00_010_010], :m_de, :r_a
addop 'ld', [0b00_011_010], :r_a, :m_de
addop 'inc', [0b00_000_011], :rp
addop 'dec', [0b00_001_011], :rp
addop 'inc', [0b00_000_100], :ry
addop 'dec', [0b00_000_101], :ry
addop 'ld', [0b00_000_110], :ry, :i8
addop 'rlca', [0b00_000_111] # rotate
addop 'rrca', [0b00_001_111]
addop 'rla', [0b00_010_111]
addop 'rra', [0b00_011_111]
addop 'daa', [0b00_100_111]
addop 'cpl', [0b00_101_111]
addop 'scf', [0b00_110_111]
addop 'ccf', [0b00_111_111]
addop 'halt', [0b01_110_110] # ld (HL), (HL)
addop 'ld', [0b01_000_000], :ry, :rz
addop 'add', [0b10_000_000], :r_a, :rz
addop 'adc', [0b10_001_000], :r_a, :rz
addop 'sub', [0b10_010_000], :r_a, :rz
addop 'sbc', [0b10_011_000], :r_a, :rz
addop 'and', [0b10_100_000], :r_a, :rz
addop 'xor', [0b10_101_000], :r_a, :rz
addop 'or', [0b10_110_000], :r_a, :rz
addop 'cmp', [0b10_111_000], :r_a, :rz # alias cp
addop 'cp', [0b10_111_000], :r_a, :rz # compare
addop_macrocc 'ret', [0b11_000_000], :setip
addop 'pop', [0b11_000_001], :rp2
addop 'ret', [0b11_001_001], :stopexec, :setip
addop 'jmp', [0b11_101_001], :r_hl, :setip, :stopexec # alias jp
addop 'jp', [0b11_101_001], :r_hl, :setip, :stopexec
addop 'ld', [0b11_111_001], :r_sp, :r_hl
addop_macrocc 'j', [0b11_000_010], :setip, :u16 # alias jp
addop_macrocc 'jp', [0b11_000_010], :setip, :u16
addop 'jmp', [0b11_000_011], :setip, :stopexec, :u16 # alias jp
addop 'jp', [0b11_000_011], :setip, :stopexec, :u16
addop 'di', [0b11_110_011] # disable interrupts
addop 'ei', [0b11_111_011]
addop_macrocc 'call', [0b11_000_100], :u16, :setip, :saveip
addop 'push', [0b11_000_101], :rp2
addop 'call', [0b11_001_101], :u16, :setip, :saveip, :stopexec
addop 'add', [0b11_000_110], :r_a, :i8
addop 'adc', [0b11_001_110], :r_a, :i8
addop 'sub', [0b11_010_110], :r_a, :i8
addop 'sbc', [0b11_011_110], :r_a, :i8
addop 'and', [0b11_100_110], :r_a, :i8
addop 'xor', [0b11_101_110], :r_a, :i8
addop 'or', [0b11_110_110], :r_a, :i8
addop 'cp', [0b11_111_110], :r_a, :i8
addop 'rst', [0b11_000_111], :iy8 # call off in page 0
addop 'rlc', [0xCB, 0b00_000_000], :rz # rotate
addop 'rrc', [0xCB, 0b00_001_000], :rz
addop 'rl', [0xCB, 0b00_010_000], :rz
addop 'rr', [0xCB, 0b00_011_000], :rz
addop 'sla', [0xCB, 0b00_100_000], :rz # shift
addop 'sra', [0xCB, 0b00_101_000], :rz
addop 'srl', [0xCB, 0b00_111_000], :rz
addop 'bit', [0xCB, 0b01_000_000], :iy, :rz # bit test
addop 'res', [0xCB, 0b10_000_000], :iy, :rz # bit reset
addop 'set', [0xCB, 0b11_000_000], :iy, :rz # bit set
end
# standard z80
def init_z80
init_z80_common
addop 'ex', [0b00_001_000], :r_af # XXX really ex AF, AF' ...
addop 'djnz', [0b00_010_000], :setip, :i8
addop 'ld', [0b00_100_010], :m16, :r_hl
addop 'ld', [0b00_101_010], :r_hl, :m16
addop 'ld', [0b00_110_010], :m16, :r_a
addop 'ld', [0b00_111_010], :r_a, :m16
addop 'exx', [0b11_011_001]
addop 'out', [0b11_010_011], :i8, :r_a
addop 'in', [0b11_011_011], :r_a, :i8
addop 'ex', [0b11_100_011], :m_sp, :r_hl
addop 'ex', [0b11_101_011], :r_de, :r_hl
addop 'sll', [0xCB, 0b00_110_000], :rz
addop 'in', [0xED, 0b01_110_000], :u16
addop 'in', [0xED, 0b01_000_000], :ry, :u16
addop 'out', [0xED, 0b01_110_001], :u16
addop 'out', [0xED, 0b01_000_001], :u16, :ry
addop 'sbc', [0xED, 0b01_000_010], :r_hl, :rp
addop 'adc', [0xED, 0b01_001_010], :r_hl, :rp
addop 'ld', [0xED, 0b01_000_011], :m16, :rp
addop 'ld', [0xED, 0b01_001_011], :rp, :m16
addop 'neg', [0xED, 0b01_000_100], :r_a, :iy # dummy int field
addop 'retn', [0xED, 0b01_000_101], :stopexec # dummy int != 1 ? (1 = reti)
addop 'reti', [0xED, 0b01_001_101], :stopexec, :setip
addop 'im', [0xED, 0b01_000_110], :iy
addop 'ld', [0xED, 0b01_000_111], :r_i, :r_a
addop 'ld', [0xED, 0b01_001_111], :r_r, :r_a
addop 'ld', [0xED, 0b01_010_111], :r_a, :r_i
addop 'ld', [0xED, 0b01_011_111], :r_a, :r_r
addop 'rrd', [0xED, 0b01_100_111]
addop 'rld', [0xED, 0b01_101_111]
addop 'ldi', [0xED, 0b10_100_000]
addop 'ldd', [0xED, 0b10_101_000]
addop 'ldir', [0xED, 0b10_110_000]
addop 'lddr', [0xED, 0b10_111_000]
addop 'cpi', [0xED, 0b10_100_001]
addop 'cpd', [0xED, 0b10_101_001]
addop 'cpir', [0xED, 0b10_110_001]
addop 'cpdr', [0xED, 0b10_111_001]
addop 'ini', [0xED, 0b10_100_010]
addop 'ind', [0xED, 0b10_101_010]
addop 'inir', [0xED, 0b10_110_010]
addop 'indr', [0xED, 0b10_111_010]
addop 'outi', [0xED, 0b10_100_011]
addop 'outd', [0xED, 0b10_101_011]
addop 'otir', [0xED, 0b10_110_011]
addop 'otdr', [0xED, 0b10_111_011]
addop 'unk_ed', [0xED], :i8
addop 'unk_nop', [], :i8 # undefined opcode = nop
@unknown_opcode = @opcode_list.last
end
# gameboy processor
# from http://nocash.emubase.de/pandocs.htm#cpucomparisionwithz80
def init_gb
init_z80_common
addop 'ld', [0x08], :m16, :r_sp
addop 'stop', [0x10]
addop 'ldi', [0x22], :m_hl, :r_a # (hl++) <- a
addop 'ldi', [0x2A], :r_a, :m_hl
addop 'ldd', [0x32], :m_hl, :r_a # (hl--) <- a
addop 'ldd', [0x3A], :r_a, :m_hl
addop 'reti', [0xD9], :setip, :stopexec
# override retpo/jpo
@opcode_list.delete_if { |op| op.bin[0] & 0xE5 == 0xE0 } # rm E0 E2 E8 EA F0 F2 F8 FA
addop 'ld', [0xE0], :mf8, :r_a # (0xff00 + :i8)
addop 'ld', [0xE2], :mfc, :r_a # (0xff00 + :r_c)
addop 'add', [0xE8], :r_sp, :i8
addop 'ld', [0xEA], :m16, :r_a
addop 'ld', [0xF0], :r_a, :mf8
addop 'ld', [0xF2], :r_a, :mfc
addop 'ld', [0xF8], :r_hl, :r_sp, :i8 # hl <- sp+:i8
addop 'ld', [0xFA], :r_a, :m16
addop 'swap', [0xCB, 0x30], :rz
addop 'inv_dd', [0xDD], :stopexec # invalid prefixes
addop 'inv_ed', [0xED], :stopexec
addop 'inv_fd', [0xFD], :stopexec
addop 'unk_nop', [], :i8 # undefined opcode = nop
@unknown_opcode = @opcode_list.last
end
alias init_latest init_z80
end
end
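Two bit tricks in the opcode table above are easier to check with concrete numbers: `addop_macrocc` places the condition index in bits 3..5 of the first byte, and `init_gb` removes the opcodes matching `op.bin[0] & 0xE5 == 0xE0`. The sketch below assumes nothing from Metasm; `CONDITION_CODES`, `expand_cc` and `GB_OVERRIDDEN` are hypothetical names introduced for illustration.

```ruby
CONDITION_CODES = %w[nz z nc c po pe p m]

# hypothetical helper mirroring addop_macrocc:
# returns { mnemonic => opcode bytes } instead of registering opcodes
def expand_cc(name, bin)
  table = {}
  CONDITION_CODES.each_with_index { |cc, i|
    dbin = bin.dup        # leave the caller's array untouched
    dbin[0] |= i << 3     # condition index lands in bits 3..5
    table[name + cc] = dbin
  }
  table
end

RET_CC = expand_cc('ret', [0b11_000_000])

# first bytes selected by the Game Boy override mask in init_gb
GB_OVERRIDDEN = (0..255).select { |b| b & 0xE5 == 0xE0 }
```

The mask selects exactly the eight bytes listed in the source comment (E0 E2 E8 EA F0 F2 F8 FA), which are the `po`, `pe`, `p` and `m` forms of the conditional `ret` and `jp` encodings.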
-59
View File
@@ -1,59 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/z80/opcodes'
require 'metasm/render'
module Metasm
class Z80
class Reg
include Renderable
def render ; [self.class.i_to_s[@sz][@i]] end
end
class Memref
include Renderable
def render
r = ['(']
r << @base if @base
r << '+' if @base and @offset
r << @offset if @offset
r << ')'
end
end
def render_instruction(i)
r = []
r << i.opname
if not i.args.empty?
r << ' '
i.args.each { |a_| r << a_ << ', ' }
r.pop
end
r
end
def gui_hilight_word_regexp_init
ret = {}
# { 'B' => 'B|BC', 'BC' => 'B|C|BC' }
%w[BC DE HL].each { |w|
l0, l1 = w.split(//)
ret[l0] = "#{l0}#{l1}?"
ret[l1] = "#{l0}?#{l1}"
ret[w] = "#{l0}|#{l0}?#{l1}"
}
ret
end
def gui_hilight_word_regexp(word)
@gui_hilight_word_hash ||= gui_hilight_word_regexp_init
@gui_hilight_word_hash[word] or super(word)
end
end
end
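The highlight table built by `gui_hilight_word_regexp_init` above maps a register name to a pattern covering the register and its overlapping halves or pair. A minimal sketch of the same loop, assuming a hypothetical free function `hilight_regexps` in place of the method:

```ruby
# hypothetical standalone copy of the loop in gui_hilight_word_regexp_init
def hilight_regexps
  ret = {}
  %w[BC DE HL].each { |w|
    l0, l1 = w.split(//)
    ret[l0] = "#{l0}#{l1}?"        # high half also matches the pair
    ret[l1] = "#{l0}?#{l1}"        # low half also matches the pair
    ret[w]  = "#{l0}|#{l0}?#{l1}"  # pair matches both halves and itself
  }
  ret
end

HILIGHT = hilight_regexps
```

Selecting `BC` therefore highlights `B`, `C` and `BC`, while selecting `B` highlights `B` and `BC` but not `C`.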
@@ -5,5 +5,4 @@
require 'metasm/main'
require 'metasm/cpu/z80/decode'
require 'metasm/cpu/z80/render'
require 'metasm/dalvik/decode'
@@ -3,7 +3,7 @@
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/cpu/dalvik/opcodes'
require 'metasm/dalvik/opcodes'
require 'metasm/decode'
module Metasm
@@ -12,7 +12,7 @@ class Dalvik
end
def decode_findopcode(edata)
return if edata.ptr+2 > edata.length
return if edata.ptr >= edata.data.length
di = DecodedInstruction.new(self)
di.opcode = opcode_list[edata.decode_imm(:u16, @endianness) & 0xff]
edata.ptr -= 2
@@ -22,7 +22,7 @@ class Dalvik
def decode_instr_op(edata, di)
op = di.opcode
di.instruction.opname = op.name
val = [edata.decode_imm(:u16, @endianness)]
op.args.each { |a|
@@ -80,7 +80,7 @@ class Dalvik
Expression[Expression.make_signed((val[1] >> 8) & 0xff, 8)]
when :rlist4, :rlist5
cnt = (val[0] >> 12) & 0xf
val << edata.decode_imm(:u16, @endianness)
val << edata.decode_imm(:u16, @endianness)
[cnt, 4].min.times {
di.instruction.args << Reg.new(val[-1] & 0xf)
val[-1] >>= 4
@@ -96,40 +96,20 @@ class Dalvik
next
when :m16
val << edata.decode_imm(:u16, @endianness)
DexMethod.new(@dex, val.last)
when :fld16
val << edata.decode_imm(:u16, @endianness)
DexField.new(@dex, val.last)
when :typ16
val << edata.decode_imm(:u16, @endianness)
DexType.new(@dex, val.last)
when :str16
val << edata.decode_imm(:u16, @endianness)
DexString.new(@dex, val.last)
Method.new(@dex, val.last)
else raise SyntaxError, "Internal error: invalid argument #{a} in #{op.name}"
end
}
di.bin_length = val.length*2
return if edata.ptr > edata.length
di
end
def decode_instr_interpret(di, addr)
if di.opcode.props[:setip] and di.instruction.args.last.kind_of? Expression and di.instruction.opname =~ /^if|^goto/
arg = Expression[addr, :+, [di.instruction.args.last, :*, 2]].reduce
di.instruction.args[-1] = Expression[arg]
end
di
end
def backtrace_binding
@backtrace_binding ||= init_backtrace_binding
end
def init_backtrace_binding
@backtrace_binding ||= {}
sz = @size/8
@@ -137,12 +117,12 @@ class Dalvik
case op.name
when /invoke/
@backtrace_binding[op.name] = lambda { |di, *args| {
:callstack => Expression[:callstack, :-, sz],
:callstack => Expression[:callstack, :-, sz],
Indirection[:callstack, sz] => Expression[di.next_addr]
} }
when /return/
@backtrace_binding[op.name] = lambda { |di, *args| {
:callstack => Expression[:callstack, :+, sz]
:callstack => Expression[:callstack, :+, sz]
} }
end
}
@@ -156,9 +136,9 @@ class Dalvik
else arg
end
}
if binding = backtrace_binding[di.opcode.name]
binding[di, *a]
bd = binding[di, *a]
else
puts "unhandled instruction to backtrace: #{di}" if $VERBOSE
# assume nothing except the 1st arg is modified
@@ -170,22 +150,20 @@ class Dalvik
end
end
def get_xrefs_x(dasm, di)
if di.opcode.props[:saveip]
m = di.instruction.args.first
if m.kind_of?(DexMethod) and m.off
if m.kind_of? Method and m.off
[m.off]
else
[:default]
end
elsif di.opcode.props[:setip]
if di.opcode.name =~ /^return/
if di.opcode.name =~ /return/
[Indirection[:callstack, @size/8]]
elsif di.opcode.name =~ /^if|^goto/
[di.instruction.args.last]
else
[] # [di.instruction.args.last]
[] # [di.instruction.args.last]
end
else
[]
@@ -23,14 +23,13 @@ class Dalvik < CPU
end
end
class DexMethod
class Method
attr_accessor :dex, :midx, :off
def initialize(dex, midx)
@dex = dex
@midx = midx
if @dex and m = @dex.methods[midx] and c = @dex.classes[m.classidx] and c.data and
me = (c.data.direct_methods+c.data.virtual_methods).find { |mm| mm.methodid == midx }
# FIXME this doesn't work
me = (c.data.direct_methods+c.data.virtual_methods).find { |mm| mm.method == m }
@off = me.codeoff + me.code.insns_off
end
end
@@ -45,54 +44,6 @@ class Dalvik < CPU
end
end
class DexField
attr_accessor :dex, :fidx
def initialize(dex, fidx)
@dex = dex
@fidx = fidx
end
def to_s
if @dex and f = @dex.fields[@fidx]
@dex.types[f.classidx] + '->' + @dex.strings[f.nameidx]
else
"field_#@fidx"
end
end
end
class DexType
attr_accessor :dex, :tidx
def initialize(dex, tidx)
@dex = dex
@tidx = tidx
end
def to_s
if @dex and f = @dex.types[@tidx]
f
else
"type_#@tidx"
end
end
end
class DexString
attr_accessor :dex, :sidx
def initialize(dex, sidx)
@dex = dex
@sidx = sidx
end
def to_s
if @dex and f = @dex.strings[@sidx]
f.inspect
else
"string_#@sidx"
end
end
end
def initialize(*args)
super()
@size = args.grep(Integer).first || 32
@@ -11,7 +11,7 @@
# the opcode number is in the low-order byte, and determines the
# argument format, which may take up to 4 other words
require 'metasm/cpu/dalvik/main'
require 'metasm/dalvik/main'
module Metasm
class Dalvik
@@ -61,11 +61,9 @@ invoke_virtual_quick invoke_virtual_quick_range invoke_super_quick invoke_super_
unused_fc unused_fd unused_fe unused_ff]
def init_dalvik
@valid_props[:canthrow] = true
[:i16, :i16_32hi, :i16_64hi, :i32, :iaa, :ib, :icc, :u16, :u32, :u64,
:r16, :ra, :raa, :rb, :rbb, :rcc, :rlist16, :rlist4, :rlist5,
:m16, :fld16, :typ16, :str16
].each { |a| @valid_args[a] = true }
@valid_props << :canthrow
@valid_args = [:i16, :i16_32hi, :i16_64hi, :i32, :iaa, :ib, :icc, :u16, :u32, :u64,
:r16, :ra, :raa, :rb, :rbb, :rcc, :rlist16, :rlist4, :rlist5, :m16]
@opcode_list = []
OPCODES.each_with_index { |n, b|
@@ -82,7 +80,7 @@ unused_fc unused_fd unused_fe unused_ff]
def addop_args(op)
fmt = case op.name
when 'goto'
:fmt10t
:fmt10t
when 'nop', 'return_void'
:fmt10x
when 'const_4'
@@ -121,16 +119,12 @@ unused_fc unused_fd unused_fe unused_ff]
:fmt20t
when 'goto_32'
:fmt30t
when 'const_string'
:fmt21c_str
when 'const_class', 'check_cast',
'new_instance'
:fmt21c_typ
when 'sget', 'sget_wide', 'sget_object',
when 'const_string', 'const_class', 'check_cast',
'new_instance', 'sget', 'sget_wide', 'sget_object',
'sget_boolean', 'sget_byte', 'sget_char', 'sget_short',
'sput', 'sput_wide', 'sput_object', 'sput_boolean',
'sput_byte', 'sput_char', 'sput_short'
:fmt21c_fld
:fmt21c
when 'const_16', 'const_wide_16'
:fmt21s
when 'if_eqz', 'if_nez', 'if_ltz', 'if_gez', 'if_gtz', 'if_lez'
@@ -220,9 +214,7 @@ unused_fc unused_fd unused_fe unused_ff]
when :fmt10t; op.args << :iaa
when :fmt20t; op.args << :i16
when :fmt20bc; op.args << :iaa << :u16
when :fmt21c_str; op.args << :raa << :str16
when :fmt21c_typ; op.args << :raa << :typ16
when :fmt21c_fld; op.args << :raa << :fld16
when :fmt21c; op.args << :raa << :u16
when :fmt22x; op.args << :raa << :r16
when :fmt21s, :fmt21t; op.args << :raa << :i16
when :fmt21h; op.args << :raa << :i16_32hi
@@ -230,7 +222,7 @@ unused_fc unused_fd unused_fe unused_ff]
when :fmt23x; op.args << :raa << :rbb << :rcc
when :fmt22b; op.args << :raa << :rbb << :icc
when :fmt22s, :fmt22t; op.args << :ra << :rb << :i16
when :fmt22c, :fmt22cs; op.args << :ra << :rb << :fld16
when :fmt22c, :fmt22cs; op.args << :ra << :rb << :u16
when :fmt30t; op.args << :i32
when :fmt31t, :fmt31c; op.args << :raa << :u32
when :fmt32x; op.args << :r16 << :r16
@@ -246,7 +238,7 @@ unused_fc unused_fd unused_fe unused_ff]
when :fmt3inline
op.args << :r16 << :rlist4
when :fmt3rc, :fmt3rms
# rlist = :r16, :r16+1, :r16+2, ..., :r16+:iaa-1
# rlist = :r16, :r16+1, :r16+2, ..., :r16+:iaa-1
op.args << :r16 << :rlist16
when :fmt51l
# u64 = u16 | (u16 << 16) | ...
-1445
View File
@@ -1,1445 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
module Metasm
# this class implements a high-level debugging API (abstract superclass)
class Debugger
class Breakpoint
attr_accessor :address,
# context where the bp was defined
:pid, :tid,
# bool: oneshot ?
:oneshot,
# current bp state: :active, :inactive (internal use), :disabled (user-specified)
:state,
# type: type of breakpoint (:bpx = soft, :hwbp = hard, :bpm = memory)
:type,
# Expression if this is a conditional bp
# may be a Proc, String or Expression, evaluated every time the breakpoint hits
# if it returns 0 or false, the breakpoint is ignored
:condition,
# Proc to run if this bp has a callback
:action,
# Proc to run to emulate the overwritten instr behavior
# used to avoid unset/singlestep/re-set, more multithread friendly
# may be a DecodedInstruction for lazy initialization, see Debugger#init_bpx/has_emul_instr(bpx)
:emul_instr,
# internal data, cpu-specific (overwritten byte for a softbp, memory type/size for hwbp..)
:internal,
# reference breakpoints sharing a target implementation (same hw debug register, soft bp addr...)
# shared is an array of Breakpoints, the same Array object in all shared breakpoints
# owner is a hash key => shared (dbg.breakpoint)
# key is an identifier for the Bp class in owner (bp.address)
:hash_shared, :hash_owner, :hash_key,
# user-defined breakpoint-specific stuff
:userdata
# append the breakpoint to hash_owner + hash_shared
def add(owner=@hash_owner)
@hash_owner = owner
@hash_key ||= @address
return add_bpm if @type == :bpm
if pv = owner[@hash_key]
@hash_shared = pv.hash_shared
@internal ||= pv.internal
@emul_instr ||= pv.emul_instr
else
owner[@hash_key] = self
@hash_shared = []
end
@hash_shared << self
end
# register a bpm: add references to all page start covered in @hash_owner
def add_bpm
m = @address + @internal[:len]
a = @address & -0x1000
@hash_shared = [self]
@internal ||= {}
@internal[:orig_prot] ||= {}
while a < m
if pv = @hash_owner[a]
if not pv.hash_shared.include?(self)
pv.hash_shared.concat @hash_shared-pv.hash_shared
@hash_shared.each { |bpm| bpm.hash_shared = pv.hash_shared }
end
@internal[:orig_prot][a] = pv.internal[:orig_prot][a]
else
@hash_owner[a] = self
end
a += 0x1000
end
end
# delete the breakpoint from hash_shared, and hash_owner if empty
def del
return del_bpm if @type == :bpm
@hash_shared.delete self
if @hash_shared.empty?
@hash_owner.delete @hash_key
elsif @hash_owner[@hash_key] == self
@hash_owner[@hash_key] = @hash_shared.first
end
end
# unregister a bpm
def del_bpm
m = @address + @internal[:len]
a = @address & -0x1000
@hash_shared.delete self
while a < m
pv = @hash_owner[a]
if pv == self
if opv = @hash_shared.find { |bpm|
bpm.address < a + 0x1000 and bpm.address + bpm.internal[:len] > a
}
@hash_owner[a] = opv
else
@hash_owner.delete a
# split hash_shared on disjoint ranges
prev_shared = @hash_shared.find_all { |bpm|
bpm.address < a + 0x1000 and bpm.address + bpm.internal[:len] <= a
}
prev_shared.each { |bpm|
bpm.hash_shared = prev_shared
@hash_shared.delete bpm
}
end
end
a += 0x1000
end
end
end
# per-process data
attr_accessor :memory, :cpu, :disassembler, :breakpoint, :breakpoint_memory,
:modulemap, :symbols, :symbols_len
# per-thread data
attr_accessor :state, :info, :breakpoint_thread, :singlestep_cb, :run_method,
:run_args, :breakpoint_cause
# which/where per-process/thread stuff is stored
attr_accessor :pid_stuff, :tid_stuff, :pid_stuff_list, :tid_stuff_list
# global debugger callbacks, called whenever such an event occurs
attr_accessor :callback_singlestep, :callback_bpx, :callback_hwbp, :callback_bpm,
:callback_exception, :callback_newthread, :callback_endthread,
:callback_newprocess, :callback_endprocess, :callback_loadlibrary
# global switches, specify whether to break on exception/thread event
# can be a Proc that is evaluated (arg = info parameter of the evt_func)
# trace_children is a bool to tell if we should debug subprocesses spawned
# by the target
attr_accessor :pass_all_exceptions, :ignore_newthread, :ignore_endthread,
:trace_children
# link to the user-interface object if available
attr_accessor :gui
# initializes the debugger internal data - subclasses should call super()
def initialize
@pid_stuff = {}
@tid_stuff = {}
@log_proc = nil
@state = :dead
@info = ''
# stuff saved when we switch pids
@pid_stuff_list = [:memory, :cpu, :disassembler, :symbols, :symbols_len,
:modulemap, :breakpoint, :breakpoint_memory, :tid, :tid_stuff,
:dead_process]
@tid_stuff_list = [:state, :info, :breakpoint_thread, :singlestep_cb,
:run_method, :run_args, :breakpoint_cause, :dead_thread]
@callback_loadlibrary = lambda { |h| loadsyms(h[:address]) ; continue }
@callback_newprocess = lambda { |h| log "process #{@pid} attached" }
@callback_endprocess = lambda { |h| log "process #{@pid} died" }
initialize_newpid
initialize_newtid
end
def dasm; disassembler; end
def shortname; self.class.name.split('::').last.downcase; end
attr_reader :pid
# change pid and associated cached data
# this will also re-load the previously selected tid for this process
def pid=(npid)
return if npid == pid
raise "invalid pid" if not check_pid(npid)
swapout_pid
@pid = npid
swapin_pid
end
alias set_pid pid=
attr_reader :tid
def tid=(ntid)
return if ntid == tid
raise "invalid tid" if not check_tid(ntid)
swapout_tid
@tid = ntid
swapin_tid
end
alias set_tid tid=
# creates stuff related to a new process being debugged
# includes disassembler, modulemap, symbols, breakpoints
# subclasses should check that @pid maps to a real process and raise() otherwise
# to be called with @pid/@tid set, calls initialize_memory+initialize_cpu
def initialize_newpid
return if not pid
@pid_stuff_list.each { |s| instance_variable_set("@#{s}", nil) }
@symbols = {}
@symbols_len = {}
@modulemap = {}
@breakpoint = {}
@breakpoint_memory = {}
@tid_stuff = {}
initialize_cpu
initialize_memory
initialize_disassembler
end
# subclasses should check that @tid maps to a real thread and raise() otherwise
def initialize_newtid
return if not tid
@tid_stuff_list.each { |s| instance_variable_set("@#{s}", nil) }
@state = :stopped
@info = 'new'
@breakpoint_thread = {}
gui.swapin_tid if @disassembler and gui.respond_to?(:swapin_tid)
end
# initialize the disassembler from @cpu/@memory
def initialize_disassembler
return if not @memory or not @cpu
@disassembler = Shellcode.decode(@memory, @cpu).disassembler
gui.swapin_pid if gui.respond_to?(:swapin_pid)
end
# we're switching focus from one pid to another, save current pid data
def swapout_pid
return if not pid
swapout_tid
gui.swapout_pid if gui.respond_to?(:swapout_pid)
@pid_stuff[@pid] ||= {}
@pid_stuff_list.each { |fld|
@pid_stuff[@pid][fld] = instance_variable_get("@#{fld}")
}
end
# we're switching focus from one tid to another, save current tid data
def swapout_tid
return if not tid
gui.swapout_tid if gui.respond_to?(:swapout_tid)
@tid_stuff[@tid] ||= {}
@tid_stuff_list.each { |fld|
@tid_stuff[@tid][fld] = instance_variable_get("@#{fld}")
}
end
# we're switching focus from one pid to another, load current pid data
def swapin_pid
return initialize_newpid if not @pid_stuff[@pid]
@pid_stuff_list.each { |fld|
instance_variable_set("@#{fld}", @pid_stuff[@pid][fld])
}
swapin_tid
gui.swapin_pid if gui.respond_to?(:swapin_pid)
end
# we're switching focus from one tid to another, load current tid data
def swapin_tid
return initialize_newtid if not @tid_stuff[@tid]
@tid_stuff_list.each { |fld|
instance_variable_set("@#{fld}", @tid_stuff[@tid][fld])
}
gui.swapin_tid if gui.respond_to?(:swapin_tid)
end
# delete references to the current pid
# switch to another pid, set @state = :dead if none available
def del_pid
@pid_stuff.delete @pid
if @pid = @pid_stuff.keys.first
swapin_pid
else
@state = :dead
@info = ''
@tid = nil
end
end
# delete references to the current thread
def del_tid
@tid_stuff.delete @tid
if @tid = @tid_stuff.keys.first
swapin_tid
else
del_tid_notid
end
end
# wipe the whole process when no TID is left
# XXX we may have a pending evt_newthread...
def del_tid_notid
del_pid
end
# change the debugger to a specific pid/tid
# if given a block, run the block and then restore the original pid/tid
# npid may be an object that responds to #pid/#tid
def switch_context(npid, ntid=nil, &b)
if npid.respond_to?(:pid)
ntid ||= npid.tid
npid = npid.pid
end
oldpid = pid
oldtid = tid
set_pid npid
set_tid ntid if ntid
if b
# shortcut begin..ensure overhead
return b.call if oldpid == pid and oldtid == tid
begin
b.call
ensure
set_pid oldpid
set_tid oldtid
end
end
end
alias set_context switch_context
# iterate over all pids, yield in the context of this pid
def each_pid(&b)
# ensure @pid is last, so that we finish in the current context
lst = @pid_stuff.keys - [@pid]
lst << @pid
return lst if not b
lst.each { |p|
set_pid p
b.call
}
end
# iterate over all tids of the current process, yield in its context
def each_tid(&b)
lst = @tid_stuff.keys - [@tid]
lst << @tid
return lst if not b
lst.each { |t|
set_tid t rescue next
b.call
}
end
# iterate over all tids of all pids, yield in their context
def each_pid_tid(&b)
each_pid { each_tid { b.call } }
end
# create a thread/process breakpoint
# addr can be a numeric address, an Expression that is resolved, or
# a String that is parsed+resolved
# each key of info is assigned to the matching attribute of the breakpoint
# standard keys are :type, :oneshot, :condition, :action
# returns the Breakpoint object
def add_bp(addr, info={})
info[:pid] ||= @pid
# don't define :tid for bpx: otherwise del_bp may switch_context to this thread, which may not be stopped, and then we can't ptrace_write
info[:tid] ||= @tid if info[:pid] == @pid and info[:type] == :hwbp
b = Breakpoint.new
info.each { |k, v|
b.send("#{k}=", v)
}
switch_context(b) {
addr = resolve_expr(addr) if not addr.kind_of? ::Integer
b.address = addr
b.hash_owner ||= case b.type
when :bpm; @breakpoint_memory
when :hwbp; @breakpoint_thread
when :bpx; @breakpoint
end
# XXX bpm may hash_share with an :active, but be larger and still need enable()
b.add
enable_bp(b) if not info[:state]
}
b
end
# remove a breakpoint
def del_bp(b)
disable_bp(b)
b.del
end
# activate an inactive breakpoint
def enable_bp(b)
return if b.state == :active
if not b.hash_shared.find { |bb| bb.state == :active }
switch_context(b) {
if not b.internal
init_bpx(b) if b.type == :bpx
b.internal ||= {}
b.hash_shared.each { |bb| bb.internal ||= b.internal }
end
do_enable_bp(b)
}
end
b.state = :active
end
# deactivate an active breakpoint
def disable_bp(b, newstate = :inactive)
return if b.state != :active
b.state = newstate
return if b.hash_shared.find { |bb| bb.state == :active }
switch_context(b) {
do_disable_bp(b)
}
end
# delete all breakpoints defined in the current thread
def del_all_breakpoints_thread
@breakpoint_thread.values.map { |b| b.hash_shared }.flatten.uniq.each { |b| del_bp(b) }
end
# delete all breakpoints for the current process and all its threads
def del_all_breakpoints
each_tid { del_all_breakpoints_thread }
@breakpoint.values.map { |b| b.hash_shared }.flatten.uniq.each { |b| del_bp(b) }
@breakpoint_memory.values.uniq.map { |b| b.hash_shared }.flatten.uniq.each { |b| del_bp(b) }
end
# calls do_enable_bpm for bpms, or @cpu.dbg_enable_bp
def do_enable_bp(b)
if b.type == :bpm; do_enable_bpm(b)
else @cpu.dbg_enable_bp(self, b)
end
end
# calls do_disable_bpm for bpms, or @cpu.dbg_disable_bp
def do_disable_bp(b)
if b.type == :bpm; do_disable_bpm(b)
else @cpu.dbg_disable_bp(self, b)
end
end
# called in the context of the target when a bpx is to be initialized
# may (lazily) initialize b.emul_instr for virtual singlestep
def init_bpx(b)
# don't bother setting stuff up if it is never to be used
return if b.oneshot and not b.condition
# lazy setup of b.emul_instr: delay building emulating lambda to if/when actually needed
# we still need to disassemble now and update @disassembler, before we patch the memory for the bpx
di = init_bpx_disassemble(b.address)
b.hash_shared.each { |bb| bb.emul_instr = di }
end
# retrieve the di at a given address, disassemble if needed
# TODO make it so this doesn't interfere with later 'real' disassembler commands, eg disassemble() or disassemble_fast_deep()
# (right now, when they see the block already present they stop all processing)
def init_bpx_disassemble(addr)
@disassembler.disassemble_fast_block(addr)
@disassembler.di_at(addr)
end
# checks if bp has an emul_instr
# do the lazy initialization if needed
def has_emul_instr(bp)
if bp.emul_instr.kind_of?(DecodedInstruction)
if di = bp.emul_instr and fdbd = @disassembler.get_fwdemu_binding(di, register_pc) and
fdbd.all? { |k, v| (k.kind_of?(Symbol) or k.kind_of?(Indirection)) and
k != :incomplete_binding and v != Expression::Unknown }
# setup a lambda that will mimic, using the debugger primitives, the actual execution of the instruction
bp.emul_instr = lambda {
fdbd.map { |k, v|
k = Indirection[emulinstr_resv(k.pointer), k.len] if k.kind_of?(Indirection)
[k, emulinstr_resv(v)]
}.each { |k, v|
if k.to_s =~ /flags?_(.+)/i
f = $1.downcase.to_sym
set_flag_value(f, v)
elsif k.kind_of?(Symbol)
set_reg_value(k, v)
elsif k.kind_of?(Indirection)
memory_write_int(k.pointer, v, k.len)
end
}
}
bp.hash_shared.each { |bb| bb.emul_instr = bp.emul_instr }
else
bp.hash_shared.each { |bb| bb.emul_instr = nil }
end
end
bp.emul_instr
end
def emulinstr_resv(e)
r = e
flags = Expression[r].externals.uniq.find_all { |f| f.to_s =~ /flags?_(.+)/i }
if flags.first
bd = {}
flags.each { |f|
f.to_s =~ /flags?_(.+)/i
bd[f] = get_flag_value($1.downcase.to_sym)
}
r = r.bind(bd)
end
resolve(r)
end
# sets a breakpoint on execution
def bpx(addr, oneshot=false, cond=nil, &action)
h = { :type => :bpx }
h[:oneshot] = true if oneshot
h[:condition] = cond if cond
h[:action] = action if action
add_bp(addr, h)
end
# sets a hardware breakpoint
# mtype in :r :w :x
# mlen is the size of the memory zone to cover
# mlen may be constrained by the architecture
def hwbp(addr, mtype=:x, mlen=1, oneshot=false, cond=nil, &action)
h = { :type => :hwbp }
h[:hash_owner] = @breakpoint_thread
addr = resolve_expr(addr) if not addr.kind_of? ::Integer
mtype = mtype.to_sym
h[:hash_key] = [addr, mtype, mlen]
h[:internal] = { :type => mtype, :len => mlen }
h[:oneshot] = true if oneshot
h[:condition] = cond if cond
h[:action] = action if action
add_bp(addr, h)
end
# sets a memory breakpoint
# mtype is :r :w :rw or :x
# mlen is the size of the memory zone to cover
def bpm(addr, mtype=:r, mlen=4096, oneshot=false, cond=nil, &action)
h = { :type => :bpm }
addr = resolve_expr(addr) if not addr.kind_of? ::Integer
h[:hash_key] = addr & -4096 # XXX actually referenced at addr, addr+4096, ... addr+len
h[:internal] = { :type => mtype, :len => mlen }
h[:oneshot] = true if oneshot
h[:condition] = cond if cond
h[:action] = action if action
add_bp(addr, h)
end
# define the lambda to use to log stuff
def set_log_proc(l=nil, &b)
@log_proc = l || b
end
# show information to the user, uses log_proc if defined
def log(*a)
if @log_proc
a.each { |aa| @log_proc[aa] }
else
puts(*a) if $VERBOSE
end
end
# marks the current cache of memory/regs invalid
def invalidate
@memory.invalidate if @memory
end
# invalidates the EncodedData backend for the dasm sections
def dasm_invalidate
disassembler.sections.each_value { |s| s.data.invalidate if s.data.respond_to?(:invalidate) } if disassembler
end
# return all breakpoints set on a specific address (or all bp)
def all_breakpoints(addr=nil)
ret = []
if addr
if b = @breakpoint[addr]
ret |= b.hash_shared
end
else
@breakpoint.each_value { |bb| ret |= bb.hash_shared }
end
@breakpoint_thread.each_value { |bb|
next if addr and bb.address != addr
ret |= bb.hash_shared
}
@breakpoint_memory.each_value { |bb|
next if addr and (bb.address+bb.internal[:len] <= addr or bb.address > addr)
ret |= bb.hash_shared
}
ret
end
# return one of the breakpoints at address addr (the first accepted by the block, if given)
def find_breakpoint(addr=nil, &b)
return @breakpoint[addr] if @breakpoint[addr] and (not b or b.call(@breakpoint[addr]))
all_breakpoints(addr).find { |bp| not b or b.call(bp) }
end
# to be called right before resuming execution of the target
# run_m is the method that should be called if the execution is stopped
# due to a side-effect of the debugger (bpx with wrong condition etc)
# returns nil if the execution should be avoided (just deleted the dead thread/process)
def check_pre_run(run_m, *run_a)
if @dead_process
del_pid
return
elsif @dead_thread
del_tid
return
elsif @state == :running
return
end
@cpu.dbg_check_pre_run(self) if @cpu.respond_to?(:dbg_check_pre_run)
@breakpoint_cause = nil
@run_method = run_m
@run_args = run_a
@info = nil
true
end
# called when the target stops due to a singlestep exception
def evt_singlestep(b=nil)
b ||= find_singlestep
return evt_exception(:type => 'singlestep') if not b
@state = :stopped
@info = 'singlestep'
@cpu.dbg_evt_singlestep(self) if @cpu.respond_to?(:dbg_evt_singlestep)
callback_singlestep[] if callback_singlestep
if cb = @singlestep_cb
@singlestep_cb = nil
cb.call # call last, as the cb may change singlestep_cb/state/etc
end
end
# returns true if the singlestep is due to us
def find_singlestep
return @cpu.dbg_find_singlestep(self) if @cpu.respond_to?(:dbg_find_singlestep)
@run_method == :singlestep
end
# called when the target stops due to a soft breakpoint exception
def evt_bpx(b=nil)
b ||= find_bp_bpx
# TODO handle race:
# bpx foo ; thread hits foo ; we bc foo ; os notify us of bp hit but we already cleared everything related to 'bpx foo' -> unhandled bp exception
return evt_exception(:type => 'breakpoint') if not b
@state = :stopped
@info = 'breakpoint'
@cpu.dbg_evt_bpx(self, b) if @cpu.respond_to?(:dbg_evt_bpx)
callback_bpx[b] if callback_bpx
post_evt_bp(b)
end
# return the breakpoint that is responsible for the evt_bpx
def find_bp_bpx
return @cpu.dbg_find_bpx(self) if @cpu.respond_to?(:dbg_find_bpx)
@breakpoint[pc]
end
# called when the target stops due to a hwbp exception
def evt_hwbp(b=nil)
b ||= find_bp_hwbp
return evt_exception(:type => 'hwbp') if not b
@state = :stopped
@info = 'hwbp'
@cpu.dbg_evt_hwbp(self, b) if @cpu.respond_to?(:dbg_evt_hwbp)
callback_hwbp[b] if callback_hwbp
post_evt_bp(b)
end
# return the breakpoint that is responsible for the evt_hwbp
def find_bp_hwbp
return @cpu.dbg_find_hwbp(self) if @cpu.respond_to?(:dbg_find_hwbp)
@breakpoint_thread.values.find { |b| b.address == pc }
end
# called for archs where the same interrupt is generated for hwbp and singlestep
# checks if a hwbp matches, then call evt_hwbp, else call evt_singlestep (which
# will forward to evt_exception if singlestep does not match either)
def evt_hwbp_singlestep
if b = find_bp_hwbp
evt_hwbp(b)
else
evt_singlestep
end
end
# called when the target stops due to a memory exception caused by a memory bp
# called by evt_exception
def evt_bpm(b)
@state = :stopped
@info = 'bpm'
callback_bpm[b] if callback_bpm
post_evt_bp(b)
end
# return a bpm whose page coverage includes the fault described in info
def find_bp_bpm(info)
@breakpoint_memory[info[:fault_addr] & -0x1000]
end
# returns true if the fault described in info is valid to trigger b
def check_bpm_range(b, info)
return if b.address+b.internal[:len] <= info[:fault_addr]
return if b.address >= info[:fault_addr] + info[:fault_len]
case b.internal[:type]
when :r; info[:fault_access] == :r # or info[:fault_access] == :x
when :w; info[:fault_access] == :w
when :x; info[:fault_access] == :x # XXX non-NX cpu => check pc is in bpm range ?
when :rw; true
end
end
# handles breakpoint conditions/callbacks etc
def post_evt_bp(b)
@breakpoint_cause = b
found_valid_active = false
pre_callback_pc = pc
# XXX may have many active bps with callback that continue/singlestep/singlestep{}...
b.hash_shared.dup.find_all { |bb|
# ignore inactive bps
next if bb.state != :active
# ignore out-of-range bpms
next if bb.type == :bpm and not check_bpm_range(bb, b.internal)
# check condition
case bb.condition
when nil; cd = 1
when Proc; cd = bb.condition.call
when String, Expression; cd = resolve_expr(bb.condition)
else raise "unknown bp condition #{bb.condition.inspect}"
end
next if not cd or cd == 0
found_valid_active = true
# oneshot
del_bp(bb) if bb.oneshot
bb.action
}.each { |bb| bb.action.call }
# discard @breakpoint_cause if a bp callback did modify register_pc
@breakpoint_cause = nil if pc != pre_callback_pc
# we did break due to a bp whose condition is not true: resume
# (unless a callback already resumed)
resume_badbreak(b) if not found_valid_active and @state == :stopped
end
# called whenever the target stops due to an exception
# type may be:
# * 'access violation', :fault_addr, :fault_len, :fault_access (:r/:w/:x)
# anything else for other exceptions (access violation is special to handle bpm)
# ...
def evt_exception(info={})
if info[:type] == 'access violation' and b = find_bp_bpm(info)
info[:fault_len] ||= 1
b.internal.update info
return evt_bpm(b)
end
@state = :stopped
@info = "exception #{info[:type]}"
callback_exception[info] if callback_exception
pass = pass_all_exceptions
pass = pass[info] if pass.kind_of? Proc
if pass
pass_current_exception
resume_badbreak
end
end
def evt_newthread(info={})
@state = :stopped
@info = 'new thread'
callback_newthread[info] if callback_newthread
ign = ignore_newthread
ign = ign[info] if ign.kind_of? Proc
if ign
continue
end
end
def evt_endthread(info={})
@state = :stopped
@info = 'end thread'
# mark the thread as to be deleted on next check_pre_run
@dead_thread = true
callback_endthread[info] if callback_endthread
ign = ignore_endthread
ign = ign[info] if ign.kind_of? Proc
if ign
continue
end
end
def evt_newprocess(info={})
@state = :stopped
@info = 'new process'
callback_newprocess[info] if callback_newprocess
end
def evt_endprocess(info={})
@state = :stopped
@info = 'end process'
@dead_process = true
callback_endprocess[info] if callback_endprocess
end
def evt_loadlibrary(info={})
@state = :stopped
@info = 'loadlibrary'
callback_loadlibrary[info] if callback_loadlibrary
end
# called when we did break due to a breakpoint whose condition is invalid
# resume execution as if we never stopped
# disable offending bp + singlestep if needed
def resume_badbreak(b=nil)
# ensure we didn't delete b
if b and b.hash_shared.find { |bb| bb.state == :active }
rm = @run_method
if rm == :singlestep
singlestep_bp(b)
else
ra = @run_args
singlestep_bp(b) { send rm, *ra }
end
else
send @run_method, *@run_args
end
end
# singlesteps over an active breakpoint and run its block
# if the breakpoint provides an emulation stub, run that, otherwise
# disable the breakpoint, singlestep, and re-enable
def singlestep_bp(bp, &b)
if has_emul_instr(bp)
@state = :stopped
bp.emul_instr.call
b.call if b
else
bp.hash_shared.each { |bb|
disable_bp(bb, :temp_inactive) if bb.state == :active
}
# this *should* work with different bps stopping the current instr
prev_sscb = @singlestep_cb
singlestep {
bp.hash_shared.each { |bb|
enable_bp(bb) if bb.state == :temp_inactive
}
prev_sscb[] if prev_sscb
b.call if b
}
end
end
# checks if @breakpoint_cause is valid, or was obsoleted by the user changing pc
def check_breakpoint_cause
if bp = @breakpoint_cause and
(bp.type == :bpx or (bp.type == :hwbp and bp.internal[:type] == :x)) and
pc != bp.address
bp = @breakpoint_cause = nil
end
bp
end
# checks if the running target has stopped (nonblocking)
# returns false if no debug event happened
def check_target
do_check_target
end
# waits until the running target stops (due to a breakpoint, fault, etc)
def wait_target
do_wait_target while @state == :running
end
# resume execution of the target
# bypasses a software breakpoint on pc if needed
# thread breakpoints must be manually disabled before calling continue
def continue
if b = check_breakpoint_cause and b.hash_shared.find { |bb| bb.state == :active }
singlestep_bp(b) {
next if not check_pre_run(:continue)
do_continue
}
else
return if not check_pre_run(:continue)
do_continue
end
end
alias run continue
# continue ; wait_target
def continue_wait
continue
wait_target
end
# resume execution of the target one instruction at a time
def singlestep(&b)
@singlestep_cb = b
bp = check_breakpoint_cause
return if not check_pre_run(:singlestep)
if bp and bp.hash_shared.find { |bb| bb.state == :active } and has_emul_instr(bp)
@state = :stopped
bp.emul_instr.call
invalidate
evt_singlestep(true)
else
do_singlestep
end
end
# singlestep ; wait_target
def singlestep_wait(&b)
singlestep(&b)
wait_target
end
# tests whether stepover() should handle the specified instruction by putting
# a breakpoint at next_addr (true) or by a plain singlestep (false)
def need_stepover(di = di_at(pc))
di and @cpu.dbg_need_stepover(self, di.address, di)
end
# stepover: singlesteps, but does not enter subfunctions
def stepover
di = di_at(pc)
if need_stepover(di)
bpx di.next_addr, true, Expression[:tid, :==, @tid]
continue
else
singlestep
end
end
# stepover ; wait_target
def stepover_wait
stepover
wait_target
end
# checks if an instruction should stop the stepout() (eg it is a return instruction)
def end_stepout(di = di_at(pc))
di and @cpu.dbg_end_stepout(self, di.address, di)
end
# stepover until finding the last instruction of the function
def stepout
# TODO thread-local bps
while not end_stepout
stepover
wait_target
end
do_singlestep
end
def stepout_wait
stepout
wait_target
end
# set a singleshot breakpoint, run the process, and wait
def go(target, cond=nil)
bpx(target, true, cond)
continue_wait
end
# continue_wait until @state == :dead
def run_forever
continue_wait until @state == :dead
end
# decode the Instruction at the address, use the @disassembler cache if available
def di_at(addr)
@disassembler.di_at(addr) || @disassembler.disassemble_instruction(addr)
end
# list the general purpose register names available for the target
def register_list
@cpu.dbg_register_list
end
# hash { register_name => register_size_in_bits }
def register_size
@cpu.dbg_register_size
end
# retrieves the name of the register holding the program counter (address of the next instruction)
def register_pc
@cpu.dbg_register_pc
end
# retrieve the name of the register holding the stack pointer
def register_sp
@cpu.dbg_register_sp
end
# the name of the register holding the cpu flags
def register_flags
@cpu.dbg_register_flags
end
# list of flags available in the flag register
def flag_list
@cpu.dbg_flag_list
end
# retrieve the value of the program counter register (eg eip)
def pc
get_reg_value(register_pc)
end
alias ip pc
# change the value of pc
def pc=(v)
set_reg_value(register_pc, v)
end
alias ip= pc=
# retrieve the value of the stack pointer register
def sp
get_reg_value(register_sp)
end
# update the stack pointer
def sp=(v)
set_reg_value(register_sp, v)
end
# retrieve the value of a flag (0/1)
def get_flag_value(f)
@cpu.dbg_get_flag(self, f)
end
# retrieve the value of a flag (true/false)
def get_flag(f)
get_flag_value(f) != 0
end
# change the value of a flag
def set_flag_value(f, v)
(v && v != 0) ? set_flag(f) : unset_flag(f)
end
# switch the value of a flag (true->false, false->true)
def toggle_flag(f)
set_flag_value(f, 1-get_flag_value(f))
end
# set the value of the flag to true
def set_flag(f)
@cpu.dbg_set_flag(self, f)
end
# set the value of the flag to false
def unset_flag(f)
@cpu.dbg_unset_flag(self, f)
end
# returns the name of the module containing addr or nil
def addr2module(addr)
@modulemap.keys.find { |k| @modulemap[k][0] <= addr and @modulemap[k][1] > addr }
end
# returns a string describing addr in terms of module and symbol (eg 'libc.so.6!printf+2f')
def addrname(addr)
(addr2module(addr) || '???') + '!' +
if s = @symbols[addr] ? addr : @symbols_len.keys.find { |s_| s_ < addr and s_ + @symbols_len[s_] > addr }
@symbols[s] + (addr == s ? '' : ('+%x' % (addr-s)))
else '%08x' % addr
end
end
# same as addrname, but scan preceding addresses if no symbol matches
def addrname!(addr)
(addr2module(addr) || '???') + '!' +
if s = @symbols[addr] ? addr :
@symbols_len.keys.find { |s_| s_ < addr and s_ + @symbols_len[s_] > addr } ||
@symbols.keys.sort.find_all { |s_| s_ < addr and s_ + 0x10000 > addr }.max
@symbols[s] + (addr == s ? '' : ('+%x' % (addr-s)))
else '%08x' % addr
end
end
# loads the symbols from a mapped module
def loadsyms(addr, name='%08x'%addr.to_i)
if addr.kind_of? String
modules.each { |m|
if m.path =~ /#{addr}/i
addr = m.addr
name = File.basename m.path
break
end
}
return if not addr.kind_of? Integer
end
return if not peek = @memory.get_page(addr, 4)
if peek == "\x7fELF"
cls = LoadedELF
elsif peek[0, 2] == "MZ" and @memory[addr+@memory[addr+0x3c,4].unpack('V').first, 4] == "PE\0\0"
cls = LoadedPE
else return
end
begin
e = cls.load @memory[addr, 0x1000_0000]
e.load_address = addr
e.decode_header
e.decode_exports
rescue
# cache the error so that we don't hit it every time
@modulemap[addr.to_s(16)] ||= [addr, addr+0x1000]
return
end
if n = e.module_name and n != name
name = n
end
@modulemap[name] ||= [addr, addr+e.module_size]
cnt = 0
e.module_symbols.each { |n_, a, l|
cnt += 1
a += addr
@disassembler.set_label_at(a, n_, false)
@symbols[a] = n_ # XXX store "lib!sym" ?
if l and l > 1; @symbols_len[a] = l
else @symbols_len.delete a # we may overwrite an existing symbol, keep len in sync
end
}
log "loaded #{cnt} symbols from #{name}"
true
end
# scan the target memory for loaded libraries, load their symbols
def scansyms(addr=0, max=@memory.length-0x1000-addr)
while addr <= max
loadsyms(addr)
addr += 0x1000
end
end
# load symbols from all libraries found by the OS module
def loadallsyms(&b)
modules.each { |m|
b.call(m.addr) if b
loadsyms(m.addr, m.path)
}
end
# see Disassembler#load_map
def load_map(str, off=0)
str = File.read(str) if File.exist?(str)
sks = @disassembler.sections.keys.sort
str.each_line { |l|
case l.strip
when /^([0-9A-F]+)\s+(\w+)\s+(\w+)/i # kernel.map style
a = $1.to_i(16) + off
n = $3
when /^([0-9A-F]+):([0-9A-F]+)\s+([a-z_]\w+)/i # IDA style
# see Disassembler for comments
a = sks[$1.to_i(16)] + $2.to_i(16) + off
n = $3
else next
end
@disassembler.set_label_at(a, n, false)
@symbols[a] = n
}
end
# parses the expression contained in arg
def parse_expr(arg)
parse_expr!(arg.dup)
end
# parses the expression contained in arg, updates arg to point after the expr
def parse_expr!(arg, &b)
return if not e = IndExpression.parse_string!(arg) { |s|
# handle 400000 -> 0x400000
# XXX no way to override and force decimal interpretation..
if s.length > 4 and not @disassembler.get_section_at(s.to_i) and @disassembler.get_section_at(s.to_i(16))
s.to_i(16)
else
s.to_i
end
}
# resolve ambiguous symbol names/hex values
bd = {}
e.externals.grep(::String).each { |ex|
if not v = register_list.find { |r| ex.downcase == r.to_s.downcase } ||
(b && b.call(ex)) || symbols.index(ex)
lst = symbols.values.find_all { |s| s.downcase.include? ex.downcase }
case lst.length
when 0
if ex =~ /^[0-9a-f]+$/i and @disassembler.get_section_at(ex.to_i(16))
v = ex.to_i(16)
else
raise "unknown symbol name #{ex}"
end
when 1
v = symbols.index(lst.first)
log "using #{lst.first} for #{ex}"
else
suggest = lst[0, 50].join(', ')
suggest = suggest[0, 125] + '...' if suggest.length > 128
raise "ambiguous symbol name #{ex}: #{suggest} ?"
end
end
bd[ex] = v
}
e = e.bind(bd)
e
end
# resolves an expression involving register values and/or memory indirection using the current context
# uses #register_list, #get_reg_value, @memory, @cpu
# :tid/:pid resolve to the current thread/process id
def resolve_expr(e)
e = parse_expr(e) if e.kind_of? ::String
bd = { :tid => @tid, :pid => @pid }
Expression[e].externals.each { |ex|
next if bd[ex]
case ex
when ::Symbol; bd[ex] = get_reg_value(ex)
when ::String; bd[ex] = @symbols.index(ex) || @disassembler.prog_binding[ex] || 0
end
}
Expression[e].bind(bd).reduce { |i|
if i.kind_of? Indirection and p = i.pointer.reduce and p.kind_of? ::Integer
i.len ||= @cpu.size/8
p &= (1 << @cpu.size) - 1 if p < 0
Expression.decode_imm(@memory, i.len, @cpu, p)
end
}
end
alias resolve resolve_expr
# return/yield an array of [addr, addr symbolic name] corresponding to the current stack trace
def stacktrace(maxdepth=500, &b)
@cpu.dbg_stacktrace(self, maxdepth, &b)
end
# accepts a range or begin/end address to read memory, or a register name
def [](arg0, arg1=nil)
if arg1
arg0 = resolve_expr(arg0) if not arg0.kind_of? ::Integer
arg1 = resolve_expr(arg1) if not arg1.kind_of? ::Integer
@memory[arg0, arg1].to_str
elsif arg0.kind_of? ::Range
arg0.begin = resolve_expr(arg0.begin) if not arg0.begin.kind_of? ::Integer # cannot happen, invalid ruby Range
arg0.end = resolve_expr(arg0.end) if not arg0.end.kind_of? ::Integer
@memory[arg0].to_str
else
get_reg_value(arg0)
end
end
# accepts a range or begin/end address to write memory, or a register name
def []=(arg0, arg1, val=nil)
arg1, val = val, arg1 if not val
if arg1
arg0 = resolve_expr(arg0) if not arg0.kind_of? ::Integer
arg1 = resolve_expr(arg1) if not arg1.kind_of? ::Integer
@memory[arg0, arg1] = val
elsif arg0.kind_of? ::Range
arg0.begin = resolve_expr(arg0.begin) if not arg0.begin.kind_of? ::Integer # cannot happen, invalid ruby Range
arg0.end = resolve_expr(arg0.end) if not arg0.end.kind_of? ::Integer
@memory[arg0] = val
else
set_reg_value(arg0, val)
end
end
# read an int from the target memory, int of sz bytes (defaults to cpu.size)
def memory_read_int(addr, sz=@cpu.size/8)
addr = resolve_expr(addr) if not addr.kind_of? ::Integer
Expression.decode_imm(@memory, sz, @cpu, addr)
end
# write an int in the target memory
def memory_write_int(addr, val, sz=@cpu.size/8)
addr = resolve_expr(addr) if not addr.kind_of? ::Integer
val = resolve_expr(val) if not val.kind_of? ::Integer
@memory[addr, sz] = Expression.encode_imm(val, sz, @cpu)
end
# retrieve an argument (call at a function entrypoint)
def func_arg(nr)
@cpu.dbg_func_arg(self, nr)
end
def func_arg_set(nr, val)
@cpu.dbg_func_arg_set(self, nr, val)
end
# retrieve a function returned value (call at func exitpoint)
def func_retval
@cpu.dbg_func_retval(self)
end
def func_retval_set(val)
@cpu.dbg_func_retval_set(self, val)
end
def func_retval=(val)
@cpu.dbg_func_retval_set(self, val)
end
# retrieve a function return address (call at func entry/exit)
def func_retaddr
@cpu.dbg_func_retaddr(self)
end
def func_retaddr_set(addr)
@cpu.dbg_func_retaddr_set(self, addr)
end
def func_retaddr=(addr)
@cpu.dbg_func_retaddr_set(self, addr)
end
def load_plugin(plugin_filename)
if not File.exist?(plugin_filename) and defined? Metasmdir
# try autocomplete
pf = File.join(Metasmdir, 'samples', 'dbg-plugins', plugin_filename)
if File.exist?(pf)
plugin_filename = pf
elsif File.exist?(pf + '.rb')
plugin_filename = pf + '.rb'
end
end
if (not File.exist?(plugin_filename) or File.directory?(plugin_filename)) and File.exist?(plugin_filename + '.rb')
plugin_filename += '.rb'
end
instance_eval File.read(plugin_filename)
end
# return the list of memory mappings of the current process
# array of [start, len, perms, infos]
def mappings
[[0, @memory.length]]
end
# return a list of Process::Modules (with a #path, #addr) for the current process
def modules
[]
end
# list debugged pids
def list_debug_pids
@pid_stuff.keys | [@pid].compact
end
# return a list of OS::Process listing all alive processes (incl not debugged)
# default version only includes current debugged pids
def list_processes
list_debug_pids.map { |p| OS::Process.new(p) }
end
# check if pid is valid
def check_pid(pid)
list_processes.find { |p| p.pid == pid }
end
# list debugged tids
def list_debug_tids
@tid_stuff.keys | [@tid].compact
end
# list of thread ids existing in the current process (incl not debugged)
# default version only lists debugged tids
alias list_threads list_debug_tids
# check if tid is valid for the current process
def check_tid(tid)
list_threads.include?(tid)
end
# see EData#pattern_scan
# scans only mapped areas of @memory, using os_process.mappings
def pattern_scan(pat, start=0, len=@memory.length-start, &b)
ret = []
mappings.each { |maddr, mlen, perm, *o_|
next if perm !~ /r/i
mlen -= start-maddr if maddr < start
maddr = start if maddr < start
mlen = start+len-maddr if maddr+mlen > start+len
next if mlen <= 0
EncodedData.new(read_mapped_range(maddr, mlen)).pattern_scan(pat) { |off|
off += maddr
ret << off if not b or b.call(off)
}
}
ret
end
def read_mapped_range(addr, len)
# try to use a single get_page call
s = @memory.get_page(addr, len) || ''
# fall back to a regular memory read; parenthesize the assignment so that s holds the new data
s.length == len ? s : ((s = @memory[addr, len]) ? s.to_str : nil)
end
end
end
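The swapout_pid/swapin_pid and swapout_tid/swapin_tid methods in the Debugger class above save and restore a fixed list of instance variables for each process or thread. A minimal standalone sketch of that pattern follows. The ContextStore class and its field names are hypothetical, written only to illustrate the idea, and are not part of Metasm.

```ruby
# Hypothetical illustration of the save/restore pattern used by
# Debugger#swapout_pid / #swapin_pid; this is not Metasm code.
class ContextStore
  # names of the instance variables saved for each context
  FIELDS = [:state, :info]

  attr_reader :id
  attr_accessor :state, :info

  def initialize
    @saved = {}   # id => { field => value }
    @id = nil
  end

  # switch to another context id, saving the current one first
  def id=(new_id)
    return if new_id == @id
    swapout
    @id = new_id
    swapin
  end

  private

  # store the current values of FIELDS under the current id
  def swapout
    return if @id.nil?
    slot = (@saved[@id] ||= {})
    FIELDS.each { |f| slot[f] = instance_variable_get("@#{f}") }
  end

  # reload the values saved for the current id, or start from defaults
  def swapin
    slot = @saved[@id]
    if slot
      FIELDS.each { |f| instance_variable_set("@#{f}", slot[f]) }
    else
      @state = :stopped
      @info = 'new'
    end
  end
end

store = ContextStore.new
store.id = 1
store.state = :running
store.id = 2          # context 1 is saved, context 2 starts from defaults
store.id = 1          # the state of context 1 is restored
```

Keeping the field list in one place means a new per-context attribute only needs to be added to that list, which appears to be the reason the original uses @pid_stuff_list and @tid_stuff_list.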
+4 -35
View File
@@ -134,10 +134,9 @@ class EncodedData
# bytes from rawsize to virtsize are returned as zeroes
# ignores self.relocations
def read(len=@virtsize-@ptr)
vlen = len
vlen = @virtsize-@ptr if len > @virtsize-@ptr
str = (@ptr < @data.length) ? @data[@ptr, vlen] : ''
str = str.to_str.ljust(vlen, "\0") if str.length < vlen
len = @virtsize-@ptr if len > @virtsize-@ptr
str = (@ptr < @data.length) ? @data[@ptr, len] : ''
str = str.to_str.ljust(len, "\0") if str.length < len
@ptr += len
str
end
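The hunk above concerns how EncodedData#read treats a request that runs past the raw data: the length is clamped to the virtual size and the missing bytes come back as zeroes. A minimal sketch of that behaviour, following the variant of the hunk that clamps len itself, is shown here. TinyEData is a hypothetical stand-in that works on a plain String and ignores relocations; it is not the Metasm class.

```ruby
# Hypothetical stand-in for the raw/virtual size handling in EncodedData#read.
class TinyEData
  attr_reader :ptr

  def initialize(data, virtsize)
    @data = data           # raw bytes actually stored
    @virtsize = virtsize   # logical size, may exceed @data.length
    @ptr = 0               # current read offset
  end

  # read up to len bytes; bytes between rawsize and virtsize are zeroes
  def read(len = @virtsize - @ptr)
    len = @virtsize - @ptr if len > @virtsize - @ptr
    str = (@ptr < @data.length) ? @data[@ptr, len] : ''
    str = str.ljust(len, "\0") if str.length < len
    @ptr += len
    str
  end
end

ed = TinyEData.new('abc', 6)
first = ed.read(10)   # the request is clamped to the 6 virtual bytes
rest = ed.read        # nothing is left, so this is an empty string
```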
@@ -183,7 +182,7 @@ class CPU
# returns a DecodedInstruction or nil
def decode_instruction(edata, addr)
@bin_lookaside ||= build_bin_lookaside
di = decode_findopcode edata if edata.ptr <= edata.length
di = decode_findopcode edata
di.address = addr if di
di = decode_instr_op(edata, di) if di
decode_instr_interpret(di, addr) if di
@@ -210,35 +209,5 @@ class CPU
def delay_slot(di=nil)
0
end
def disassembler_default_func
DecodedFunction.new
end
# return something like backtrace_binding in the forward direction
# set pc_reg to some reg name (eg :pc) to include effects on the instruction pointer
def get_fwdemu_binding(di, pc_reg=nil)
fdi = di.backtrace_binding ||= get_backtrace_binding(di)
fdi = fix_fwdemu_binding(di, fdi)
if pc_reg
if di.opcode.props[:setip]
xr = get_xrefs_x(nil, di)
if xr and xr.length == 1
fdi[pc_reg] = xr[0]
else
fdi[:incomplete_binding] = Expression[1]
end
else
fdi[pc_reg] = Expression[pc_reg, :+, di.bin_length]
end
end
fdi
end
# patch a forward binding from the backtrace binding
# useful only on specific instructions that update a register *and* dereference that register (eg push)
def fix_fwdemu_binding(di, fbd)
fbd
end
end
end
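The `EncodedData#read` hunk in this file replaces the `vlen` variant with one that clamps `len` itself, so `@ptr` advances by the clamped amount and any bytes between the raw size and the virtual size come back as zeroes. A minimal standalone sketch of that post-revert behavior follows. `TinyEncodedData` is a hypothetical stand-in written for illustration, not Metasm's class, and it keeps only the three fields that `read` touches.

```ruby
# Hypothetical stand-in for Metasm::EncodedData (illustration only):
# it models just @data, @virtsize and @ptr.
class TinyEncodedData
  attr_accessor :data, :virtsize, :ptr

  def initialize(data, virtsize = data.length)
    @data = data
    @virtsize = virtsize
    @ptr = 0
  end

  # Mirrors the post-revert body: clamp len to what is left of the
  # virtual size, then pad the raw bytes with NULs up to that length.
  def read(len = @virtsize - @ptr)
    len = @virtsize - @ptr if len > @virtsize - @ptr
    str = (@ptr < @data.length) ? @data[@ptr, len] : ''
    str = str.ljust(len, "\0") if str.length < len
    @ptr += len
    str
  end
end

ed = TinyEncodedData.new("AB", 4)  # 2 raw bytes, virtual size 4
first = ed.read(3)                 # 2 raw bytes followed by 1 zero byte
rest  = ed.read(10)                # clamped to the 1 remaining virtual byte
p first, rest, ed.ptr
```

With the clamp applied to `len`, `ptr` stops at `virtsize`. In the removed variant `@ptr += len` used the unclamped value, so an oversized request could move `ptr` past the end.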
+15 -14
@@ -69,7 +69,7 @@ class Decompiler
@c_parser.toplevel.symbol.delete func.name
decompile_func(entry)
@recurse = pre_recurse
if not @c_parser.toplevel.statements.grep(C::Declaration).find { |decl| decl.var.name == func.name }
if not dcl = @c_parser.toplevel.statements.grep(C::Declaration).find { |decl| decl.var.name == func.name }
@c_parser.toplevel.statements << C::Declaration.new(func)
end
end
@@ -208,7 +208,7 @@ class Decompiler
@c_parser.toplevel.statements.delete_if { |ts| ts.kind_of? C::Declaration and ts.var.name == name }
aoff = 1
ptype.args.to_a.each { |a|
aoff = (aoff + @c_parser.typesize[:ptr] - 1) / @c_parser.typesize[:ptr] * @c_parser.typesize[:ptr]
aoff = (aoff + @c_parser.typesize[:ptr] - 1) / @c_parser.typesize[:ptr] * @c_parser.typesize[:ptr]
f.decompdata[:stackoff_type][aoff] ||= a.type
f.decompdata[:stackoff_name][aoff] ||= a.name if a.name
aoff += sizeof(a) # ary ?
@@ -293,7 +293,7 @@ class Decompiler
@dasm.function[ta] = DecodedFunction.new
puts "autofunc #{Expression[ta]}" if $VERBOSE
end
if @dasm.function[ta] and type != :subfuncret
f = dasm.auto_label_at(ta, 'func')
ta = dasm.normalize($1) if f =~ /^thunk_(.*)/
@@ -350,7 +350,7 @@ class Decompiler
:include_start => i_s, :no_check => true, :terminals => [:frameptr])
if vals.length == 1 and ee = vals.first and (ee.kind_of? Expression and (ee == Expression[:frameptr] or
(ee.lexpr == :frameptr and ee.op == :+ and ee.rexpr.kind_of? ::Integer)))
ee
ee
else e
end
end
@@ -602,12 +602,12 @@ class Decompiler
when C::If
patch_test[ce.test]
if ce.bthen.kind_of? C::Block
case ce.bthen.statements.length
case ce.bthen.statements.length
when 1
walk(ce.bthen.statements) { |sst| sst.outer = ce.bthen.outer if sst.kind_of? C::Block and sst.outer == ce.bthen }
ce.bthen = ce.bthen.statements.first
when 0
if not ce.belse and i = ce.bthen.outer.statements.index(ce)
if not ce.belse and i = ce.bthen.outer.statements.index(ce)
ce.bthen.outer.statements[i] = ce.test # TODO remove sideeffectless parts
end
end
@@ -1521,7 +1521,7 @@ class Decompiler
tabidx = off / sizeof(st)
off -= tabidx * sizeof(st)
ptr = C::CExpression[:&, [ptr, :'[]', [tabidx]]] if tabidx != 0 or ptr.type.untypedef.kind_of? C::Array
return ptr if off == 0 and (not msz or # avoid infinite recursion with eg chained list
return ptr if off == 0 and (not msz or # avoid infinite recursion with eg chained list
(ptr.kind_of? C::CExpression and ((ptr.op == :& and not ptr.lexpr and s=ptr.rexpr) or (ptr.op == :'.' and s=ptr)) and
not s.type.untypedef.kind_of? C::Union))
@@ -1656,12 +1656,13 @@ class Decompiler
ce.rexpr = p if ce.rexpr == v1
}
}
}
end
# to be run with scope = function body with only CExpr/Decl/Label/Goto/IfGoto/Return, with correct variables types
# will transform += 1 to ++, inline them to prev/next statement ('++x; if (x)..' => 'if (++x)..')
# remove useless variables ('int i;', i never used or 'i = 1; j = i;', i never read after => 'j = 1;')
# remove useless variables ('int i;', i never used or 'i = 1; j = i;', i never read after => 'j = 1;')
# remove useless casts ('(int)i' with 'int i;' => 'i')
def optimize(scope)
optimize_code(scope)
@@ -1870,7 +1871,7 @@ class Decompiler
when ::Array; exp.any? { |_e| sideeffect _e, scope }
when C::Variable; (scope and not scope.symbol[exp.name]) or exp.type.qualifier.to_a.include? :volatile
when C::CExpression; (exp.op == :* and not exp.lexpr) or exp.op == :funcall or AssignOp.include?(exp.op) or
sideeffect(exp.lexpr, scope) or sideeffect(exp.rexpr, scope)
sideeffect(exp.lexpr, scope) or sideeffect(exp.rexpr, scope)
else true # failsafe
end
end
@@ -2008,7 +2009,7 @@ class Decompiler
}.compact
tw = to - [:write]
if to.include? :split or tw.length > 1
if to.include? :split or tw.length > 1
:split
elsif tw.length == 1
tw.first
@@ -2088,7 +2089,7 @@ class Decompiler
if (e.op == :'++' or e.op == :'--') and v = (e.lexpr || e.rexpr) and v.kind_of? C::Variable and
scope.symbol[v.name] and not v.type.qualifier.to_a.include? :volatile
next if !((pos = :post.to_sym) and (oe = find_next_read_bl[label, i, v]) and oe.kind_of? C::CExpression) and
!((pos = :prev.to_sym) and (oe = find_prev_read[label, i-2, v]) and oe.kind_of? C::CExpression)
!((pos = :prev.to_sym) and (oe = find_prev_read[label, i-2, v]) and oe.kind_of? C::CExpression)
next if oe.op == :& and not oe.lexpr # no &(++eax)
# merge pre/postincrement into next/prev var usage
@@ -2220,7 +2221,7 @@ class Decompiler
}
case cnt
when 0
break if bad
break if bad
next
when 1 # good
break if e.complexity > 10 and ce_.complexity > 3 # try to keep the C readable
@@ -2442,7 +2443,7 @@ class Decompiler
end
# compare type.type cause var is an Array and the cast is a Pointer
countderef[r.rexpr.name] += 1 if r.kind_of? C::CExpression and not r.op and r.rexpr.kind_of? C::Variable and
sizeof(nil, r.type.type) == sizeof(nil, r.rexpr.type.type) rescue nil
sizeof(nil, r.type.type) == sizeof(nil, r.rexpr.type.type) rescue nil
}
vars.each { |n|
if countref[n] == countderef[n]
@@ -2452,7 +2453,7 @@ class Decompiler
v.initializer = v.initializer.first if v.initializer.kind_of? ::Array
walk_ce(tl) { |ce|
if ce.op == :'->' and C::CExpression[ce.lexpr] == C::CExpression[v]
ce.op = :'.'
ce.op = :'.'
elsif ce.lexpr == target
ce.lexpr = v
end
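Most hunks in this file appear to differ only in whitespace, including the one around the argument-offset rounding `(aoff + ptrsize - 1) / ptrsize * ptrsize`. As a reminder of what that expression computes, here is a small sketch of rounding up to a multiple using integer division. `align_up` is a hypothetical helper name chosen for this example and is not part of Metasm.

```ruby
# Round +off+ up to the next multiple of +align+.
# Relies on Integer#/ truncating for non-negative operands.
def align_up(off, align)
  (off + align - 1) / align * align
end

ptr_size = 4  # assumed pointer size for the example
offsets = [0, 1, 4, 5].map { |off| align_up(off, ptr_size) }
p offsets  # => [0, 4, 4, 8]
```

An offset that is already aligned is left unchanged, which is why the decompiler can apply the expression to every argument before adding `sizeof(a)`.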
+43 -195
@@ -233,11 +233,6 @@ class DecodedFunction
attr_accessor :finalized
# bool, if true the function does not return (eg exit() or ExitProcess())
attr_accessor :noreturn
# hash stackoff => varname
# varname is a single String object shared by all ExpressionStrings (to allow renames)
attr_accessor :localvars
# hash stack offset => di address
attr_accessor :localvars_xrefs
# if btbind_callback is defined, calls it with args [dasm, binding, funcaddr, calladdr, expr, origin, maxdepth]
# else update lazily the binding from expr.externals, and return backtrace_binding
@@ -269,16 +264,6 @@ class DecodedFunction
@backtracked_for = []
@backtrace_binding = {}
end
def get_localvar_stackoff(off, di=nil, str=nil)
if di
@localvars_xrefs ||= {}
@localvars_xrefs[off] ||= []
@localvars_xrefs[off] |= [di.address]
end
@localvars ||= {}
@localvars[off] ||= (str || (off > 0 ? 'arg_%X' % off : 'var_%X' % -off))
end
end
class CPU
@@ -453,9 +438,7 @@ class Disassembler
when ::Integer
when ::String
raise "invalid section base #{base.inspect} - not at section start" if encoded.export[base] and encoded.export[base] != 0
if ed = get_edata_at(base)
ed.del_export(base)
end
raise "invalid section base #{base.inspect} - already seen at #{@prog_binding[base]}" if @prog_binding[base] and @prog_binding[base] != Expression[base]
encoded.add_export base, 0
else raise "invalid section base #{base.inspect} - expected string or integer"
end
@@ -468,7 +451,7 @@ class Disassembler
# update section_edata.reloc
# label -> list of relocs that refers to it
@inv_section_reloc ||= {}
@inv_section_reloc = {}
@sections.each { |b, e|
e.reloc.each { |o, r|
r.target.externals.grep(::String).each { |ext| (@inv_section_reloc[ext] ||= []) << [b, e, o, r] }
@@ -507,7 +490,7 @@ class Disassembler
# ignore relocs embedded in an already-listed instr
x << Xref.new(:reloc, addr) if not x.find { |x_|
next if not x_.origin or not di_at(x_.origin)
(addr - x_.origin) < @decoded[x_.origin].bin_length rescue false
(addr - x_.origin rescue 50) < @decoded[x_.origin].bin_length
}
}
end
@@ -522,18 +505,9 @@ class Disassembler
# parses a C string for function prototypes
def parse_c(str, filename=nil, lineno=1)
@c_parser_constcache = nil
@c_parser ||= @cpu.new_cparser
@c_parser.lexer.define_weak('__METASM__DECODE__')
@c_parser.parse(str, filename, lineno)
rescue ParseError
@c_parser.lexer.feed! ''
raise
end
# list the constants ([name, integer value]) defined in the C code (#define / enums)
def c_constants
@c_parser_constcache ||= @c_parser.numeric_constants
end
# returns the canonical form of addr (absolute address integer or label of start of section + section offset)
@@ -594,7 +568,6 @@ class Disassembler
end
# returns a hash associating addr => list of labels at this addr
# label_alias[a] may be nil if a new label is created elsewhere in the edata with the same name
def label_alias
if not @label_alias_cache
@label_alias_cache = {}
@@ -649,16 +622,17 @@ class Disassembler
if not f.finalized
f.finalized = true
puts " finalize subfunc #{Expression[addr]}" if debug_backtrace
backtrace_update_function_binding(addr, f)
@cpu.backtrace_update_function_binding(self, addr, f, f.return_address)
if not f.return_address
detect_function_thunk(addr)
end
end
@comment[addr] ||= []
bd = f.backtrace_binding.reject { |k, v| Expression[k] == Expression[v] or Expression[v] == Expression::Unknown }
unk = f.backtrace_binding.map { |k, v| k if v == Expression::Unknown }.compact
bd[unk.map { |u| Expression[u].to_s }.sort.join(',')] = Expression::Unknown if not unk.empty?
add_comment(addr, "function binding: " + bd.map { |k, v| "#{k} -> #{v}" }.sort.join(', '))
add_comment(addr, "function ends at " + f.return_address.map { |ra| Expression[ra] }.join(', ')) if f.return_address
@comment[addr] |= ["function binding: " + bd.map { |k, v| "#{k} -> #{v}" }.sort.join(', ')]
@comment[addr] |= ["function ends at " + f.return_address.map { |ra| Expression[ra] }.join(', ')] if f.return_address
}
end
@@ -684,7 +658,7 @@ puts " finalize subfunc #{Expression[addr]}" if debug_backtrace
next if not f = @function[subfunc] or f.finalized
f.finalized = true
puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
backtrace_update_function_binding(subfunc, f)
@cpu.backtrace_update_function_binding(self, subfunc, f, f.return_address)
if not f.return_address
detect_function_thunk(subfunc)
end
@@ -693,7 +667,7 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
if di = @decoded[addr]
if di.kind_of? DecodedInstruction
split_block(di.block, di.address, true) if not di.block_head? # this updates di.block
split_block(di.block, di.address) if not di.block_head? # this updates di.block
di.block.add_from(from, from_subfuncret ? :subfuncret : :normal) if from and from != :default
bf = di.block
elsif di == true
@@ -752,22 +726,20 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
end
# splits an InstructionBlock, updates the blocks backtracked_for
def split_block(block, address=nil, rebacktrace=false)
def split_block(block, address=nil)
if not address # invoked as split_block(0x401012)
return if not @decoded[block].kind_of? DecodedInstruction
block, address = @decoded[block].block, block
end
return block if address == block.address
new_b = block.split address
if rebacktrace
new_b.backtracked_for.dup.each { |btt|
backtrace(btt.expr, btt.address,
:only_upto => block.list.last.address,
:include_start => !btt.exclude_instr, :from_subfuncret => btt.from_subfuncret,
:origin => btt.origin, :orig_expr => btt.orig_expr, :type => btt.type, :len => btt.len,
:detached => btt.detached, :maxdepth => btt.maxdepth)
}
end
new_b.backtracked_for.dup.each { |btt|
backtrace(btt.expr, btt.address,
:only_upto => block.list.last.address,
:include_start => !btt.exclude_instr, :from_subfuncret => btt.from_subfuncret,
:origin => btt.origin, :orig_expr => btt.orig_expr, :type => btt.type, :len => btt.len,
:detached => btt.detached, :maxdepth => btt.maxdepth)
}
new_b
end
@@ -791,7 +763,8 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
each_xref(waddr, :w) { |x|
#next if off + x.len < 0
puts "W: disasm: self-modifying code at #{Expression[waddr]}" if $VERBOSE
add_comment(di_addr, "overwritten by #{@decoded[x.origin]}")
@comment[di_addr] ||= []
@comment[di_addr] |= ["overwritten by #{@decoded[x.origin]}"]
@callback_selfmodifying[di_addr] if callback_selfmodifying
return
}
@@ -802,7 +775,6 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
block.edata.ptr = di_addr - block.address + block.edata_ptr
if not di = @cpu.decode_instruction(block.edata, di_addr)
ed = block.edata
break if ed.ptr >= ed.length and get_section_at(di_addr) and di = block.list.last
puts "#{ed.ptr >= ed.length ? "end of section reached" : "unknown instruction #{ed.data[di_addr-block.address+block.edata_ptr, 4].to_s.unpack('H*')}"} at #{Expression[di_addr]}" if $VERBOSE
return
end
@@ -811,18 +783,7 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
block.add_di di
puts di if $DEBUG
if callback_newinstr
ndi = @callback_newinstr[di]
if not ndi or not ndi.block
block.list.delete di
if ndi
block.add_di ndi
ndi.bin_length = di.bin_length if ndi.bin_length == 0
@decoded[di_addr] = ndi
end
end
di = ndi
end
di = @callback_newinstr[di] if callback_newinstr
return if not di
block = di.block
@@ -832,7 +793,7 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
if not di_addr or di.opcode.props[:stopexec] or not @program.get_xrefs_x(self, di).empty?
# do not backtrace until delay slot is finished (eg MIPS: di is a
# ret and the delay slot holds stack fixup needed to calc func_binding)
# ret and the delay slot holds stack fixup needed to calc func_binding)
# XXX if the delay slot is also xref_x or :stopexec it is ignored
delay_slot ||= [di, @cpu.delay_slot(di)]
end
@@ -874,8 +835,6 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
@entrypoints |= entrypoints
entrypoints.each { |ep| do_disassemble_fast_deep(normalize(ep)) }
@callback_finished[] if callback_finished
end
def do_disassemble_fast_deep(ep)
@@ -937,7 +896,8 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
}
if func
auto_label_at(addr, 'sub', 'loc', 'xref')
@function[addr] = (@function[:default] || DecodedFunction.new).dup
# XXX use default_btbind_callback ?
@function[addr] = DecodedFunction.new
@function[addr].finalized = true
detect_function_thunk(addr)
puts "found new function #{get_label_at(addr)} at #{Expression[addr]}" if $VERBOSE
@@ -949,7 +909,7 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
# does not recurse into subfunctions
# assumes all :saveip returns, except those pointing to a subfunc with noreturn
# yields subfunction addresses (targets of :saveip)
# no backtrace for :x (change with backtrace_maxblocks_fast)
# only backtrace for :x with maxdepth 1 (ie handles only basic push+ret)
# returns a todo-style ary
# assumes @addrs_todo is empty
def disassemble_fast_block(block, &b)
@@ -967,7 +927,6 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
# decode instruction
block.edata.ptr = di_addr - block.address + block.edata_ptr
if not di = @cpu.decode_instruction(block.edata, di_addr)
break if block.edata.ptr >= block.edata.length and get_section_at(di_addr) and di = block.list.last
return ret
end
@@ -975,18 +934,7 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
block.add_di di
puts di if $DEBUG
if callback_newinstr
ndi = @callback_newinstr[di]
if not ndi or not ndi.block
block.list.delete di
if ndi
block.add_di ndi
ndi.bin_length = di.bin_length if ndi.bin_length == 0
@decoded[di_addr] = ndi
end
end
di = ndi
end
di = @callback_newinstr[di] if callback_newinstr
return ret if not di
di_addr = di.next_addr
@@ -994,9 +942,7 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
if di.opcode.props[:stopexec] or di.opcode.props[:setip]
if di.opcode.props[:setip]
@addrs_todo = []
ar = @program.get_xrefs_x(self, di)
ar = @callback_newaddr[di.address, ar] || ar if callback_newaddr
ar.each { |expr|
@program.get_xrefs_x(self, di).each { |expr|
backtrace(expr, di.address, :origin => di.address, :type => :x, :maxdepth => @backtrace_maxblocks_fast)
}
end
@@ -1019,13 +965,8 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
end
}
ar = [di_addr]
ar = @callback_newaddr[block.list.last.address, ar] || ar if callback_newaddr
ar.each { |a|
di.block.add_to_normal(a)
ret << [a, di.address]
}
ret
di.block.add_to_normal(di_addr)
ret << [di_addr, di.address]
end
# handles when disassemble_fast encounters a call to a subfunction
@@ -1096,7 +1037,7 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
count = 0
while b = block_at(addr)
count += 1
return if count > 5 or b.list.length > 5
return if count > 5 or b.list.length > 4
if b.to_subfuncret and not b.to_subfuncret.empty?
return if b.to_subfuncret.length != 1
addr = normalize(b.to_subfuncret.first)
@@ -1106,7 +1047,7 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
return if not btb = sf.backtrace_binding
btb = btb.dup
btb.delete_if { |k, v| Expression[k] == Expression[v] }
return if btb.length > 2 or btb.values.include? Expression::Unknown
return if btb.length > 2 or btb.values.include? Expression::Unknown
else
return if not bt = b.to_normal
if bt.include? :default
@@ -1350,88 +1291,6 @@ puts " finalize subfunc #{Expression[subfunc]}" if debug_backtrace
end
end
# iterates over all instructions of a function from a given entrypoint
# carries an object while walking, the object is yielded every instruction
# every block is walked only once, after all previous blocks are done (if possible)
# on a 'jz', a [:clone] event is yielded for every path beside the first
# on a junction (eg a -> b -> d, a -> c -> d), a [:merge] event occurs if froms have different objs
# event list:
# [:di, <addr>, <decoded_instruction>, <object>]
# [:clone, <newaddr>, <oldaddr>, <object>]
# [:merge, <newaddr>, {<oldaddr1> => <object1>, <oldaddr2> => <object2>, ...}, <object1>]
# [:subfunc, <subfunc_addr>, <call_addr>, <object>]
# all events should return an object
# :merge has a copy of object1 at the end so that uninterested callers can always return args[-1]
# if an event returns false, the trace stops for the current branch
def function_walk(addr_start, obj_start)
# addresses of instrs already seen => obj
done = {}
todo = [[addr_start, obj_start]]
while hop = todo.pop
addr, obj = hop
next if done.has_key?(addr)
di = di_at(addr)
next if not di
if done.empty?
dilist = di.block.list[di.block.list.index(di)..-1]
else
# new block, check all 'from' have been seen
if not hop[2]
# may retry later
all_ok = true
di.block.each_from_samefunc(self) { |fa| all_ok = false unless done.has_key?(fa) }
if not all_ok
todo.unshift([addr, obj, true])
next
end
end
froms = {}
di.block.each_from_samefunc(self) { |fa| froms[fa] = done[fa] if done[fa] }
if froms.values.uniq.length > 1
obj = yield([:merge, addr, froms, froms.values.first])
next if obj == false
end
dilist = di.block.list
end
if dilist.each { |_di|
break if done.has_key?(_di.address) # looped back into addr_start
done[_di.address] = obj
obj = yield([:di, _di.address, _di, obj])
break if obj == false # also return false for the previous 'if'
}
from = dilist.last.address
if di.block.to_normal and di.block.to_normal[0] and
di.block.to_subfuncret and di.block.to_subfuncret[0]
# current instruction block calls into a subfunction
obj = di.block.to_normal.map { |subf|
yield([:subfunc, subf, from, obj])
}.first # propagate 1st subfunc result
next if obj == false
end
wantclone = false
di.block.each_to_samefunc(self) { |ta|
if wantclone
nobj = yield([:clone, ta, from, obj])
next if obj == false
todo << [ta, nobj]
else
todo << [ta, obj]
wantclone = true
end
}
end
end
end
# holds a backtrace result until a snapshot_addr is encountered
class StoppedExpr
attr_accessor :exprs
@@ -1497,7 +1356,7 @@ puts " not backtracking stack address #{expr}" if debug_backtrace
end
if vals = (no_check ? (!need_backtrace(expr, terminals) and [expr]) : backtrace_check_found(expr,
di, origin, type, len, maxdepth, detached, snapshot_addr))
di, origin, type, len, maxdepth, detached))
# no need to update backtracked_for
return vals
elsif maxdepth <= 0
@@ -1537,7 +1396,7 @@ puts " backtrace up #{Expression[h[:addr]]} #{oldexpr}#{" => #{expr}" if expr
if expr != oldexpr and not snapshot_addr and vals = (no_check ?
(!need_backtrace(expr, terminals) and [expr]) :
backtrace_check_found(expr, nil, origin, type, len,
maxdepth-h[:loopdetect].length, detached, snapshot_addr))
maxdepth-h[:loopdetect].length, detached))
result |= vals
next
end
@@ -1578,7 +1437,7 @@ puts " backtrace up #{Expression[h[:from]]}->#{Expression[h[:to]]} #{oldexpr}#
if expr != oldexpr and vals = (no_check ? (!need_backtrace(expr, terminals) and [expr]) :
backtrace_check_found(expr, @decoded[h[:from]], origin, type, len,
maxdepth-h[:loopdetect].length, detached, snapshot_addr))
maxdepth-h[:loopdetect].length, detached))
if snapshot_addr
expr = StoppedExpr.new vals
next expr
@@ -1639,7 +1498,7 @@ oldexpr = expr
when :func
expr = backtrace_emu_subfunc(h[:func], h[:funcaddr], h[:addr], expr, origin, maxdepth-h[:loopdetect].length)
if snapshot_addr and snapshot_addr == h[:funcaddr]
# XXX recursiveness detection needs to be fixed
# XXX recursiveness detection needs to be fixed
puts " backtrace: recursive function #{Expression[h[:funcaddr]]}" if debug_backtrace
next false
end
@@ -1647,7 +1506,7 @@ puts " backtrace: recursive function #{Expression[h[:funcaddr]]}" if debug_back
end
puts " backtrace #{h[:di] || Expression[h[:funcaddr]]} #{oldexpr} => #{expr}" if debug_backtrace and expr != oldexpr
if vals = (no_check ? (!need_backtrace(expr, terminals) and [expr]) : backtrace_check_found(expr,
h[:di], origin, type, len, maxdepth-h[:loopdetect].length, detached, snapshot_addr))
h[:di], origin, type, len, maxdepth-h[:loopdetect].length, detached))
if snapshot_addr
expr = StoppedExpr.new vals
else
@@ -1729,14 +1588,10 @@ puts " backtrace addrs_todo << #{Expression[retaddr]} from #{di} (funcret)" if
(ab = @address_binding[addr]) ? Expression[expr.bind(ab).reduce] : expr
end
def backtrace_update_function_binding(addr, func=@function[addr], retaddrs=func.return_address)
@cpu.backtrace_update_function_binding(self, addr, func, retaddrs)
end
# static resolution of indirections
def resolve(expr)
binding = Expression[expr].expr_indirections.inject(@old_prog_binding) { |binding_, ind|
e = get_edata_at(resolve(ind.target))
e, b = get_section_at(resolve(ind.target))
return expr if not e
binding_.merge ind => Expression[ e.decode_imm("u#{8*ind.len}".to_sym, @cpu.endianness) ]
}
@@ -1764,7 +1619,7 @@ puts " backtrace addrs_todo << #{Expression[retaddr]} from #{di} (funcret)" if
# TODO trace expr evolution through backtrace, to modify immediates to an expr involving label names
# TODO mov [ptr], imm ; <...> ; jmp [ptr] => rename imm as loc_XX
# eg. mov eax, 42 ; add eax, 4 ; jmp eax => mov eax, some_label-4
def backtrace_check_found(expr, di, origin, type, len, maxdepth, detached, snapshot_addr=nil)
def backtrace_check_found(expr, di, origin, type, len, maxdepth, detached)
# only entrypoints or block starts called by a :saveip are checked for being a function
# want to execute [esp] from a block start
if type == :x and di and di == di.block.list.first and @cpu.backtrace_is_function_return(expr, @decoded[origin]) and (
@@ -1794,14 +1649,11 @@ puts " backtrace addrs_todo << #{Expression[retaddr]} from #{di} (funcret)" if
end
return if need_backtrace(expr)
if snapshot_addr
return if expr.expr_externals(true).find { |ee| ee.kind_of?(Indirection) }
end
puts "backtrace #{type} found #{expr} from #{di} orig #{@decoded[origin] || Expression[origin] if origin}" if debug_backtrace
result = backtrace_value(expr, maxdepth)
# keep the ori pointer in the results to emulate volatile memory (eg decompiler prefers this)
#result << expr if not type # XXX returning multiple values for nothing is too confusing, TODO fix decompiler
result << expr if not type
result.uniq!
# create xrefs/labels
@@ -1843,7 +1695,7 @@ puts "backtrace #{type} found #{expr} from #{di} orig #{@decoded[origin] || Expr
ret = []
decode_imm = lambda { |addr, len|
edata = get_edata_at(addr)
edata, foo = get_section_at(addr)
if edata
Expression[ edata.decode_imm("u#{8*len}".to_sym, @cpu.endianness) ]
else
@@ -1951,7 +1803,7 @@ puts " backtrace_indirection for #{ind.target} failed: #{ev}" if debug_backtra
# TODO trace expression evolution to allow handling of
# mov eax, 28 ; add eax, 4 ; jmp eax
# => mov eax, (loc_xx-4)
if di and not unk and expr != n # and di.address == origin
if di and not unk # and di.address == origin
@cpu.replace_instr_arg_immediate(di.instruction, expr, n)
end
if @decoded[origin] and not unk
@@ -1998,10 +1850,6 @@ puts " backtrace_indirection for #{ind.target} failed: #{ev}" if debug_backtra
end
end
def inspect
"<Metasm::Disassembler @%x>" % object_id
end
def to_s
a = ''
dump { |l| a << l << "\n" }
@@ -2068,7 +1916,7 @@ puts " backtrace_indirection for #{ind.target} failed: #{ev}" if debug_backtra
if not xr.empty?
b["\n// Xrefs: #{xr[0, 8].join(' ')}#{' ...' if xr.length > 8}"]
end
if block.edata.inv_export[block.edata_ptr] and label_alias[block.address]
if block.edata.inv_export[block.edata_ptr]
b["\n"] if xr.empty?
label_alias[block.address].each { |name| b["#{name}:"] }
end
@@ -2085,8 +1933,8 @@ puts " backtrace_indirection for #{ind.target} failed: #{ev}" if debug_backtra
# TODO array-style data access
def dump_data(addr, edata, off, &b)
b ||= lambda { |l| puts l }
if l = edata.inv_export[off] and label_alias[addr]
l_list = label_alias[addr].sort
if l = edata.inv_export[off]
l_list = label_alias[addr].to_a.sort
l = l_list.pop || l
l_list.each { |ll|
b["#{ll}:"]
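Several hunks in this file swap calls to `add_comment` for the inlined pair `@comment[addr] ||= []` and `@comment[addr] |= [...]`. Both forms rely on `Array#|` to append a string only when it is not already in the list. A minimal sketch, with a plain Hash standing in for `@comment`; the address and the comment strings are made up for illustration.

```ruby
# Plain Hash standing in for the disassembler's @comment table:
# { addr => [list of strings] }
comment = {}
addr = 0x401000  # made-up address

# Adding the same comment twice keeps a single copy, because
# Array#| returns the set union of both arrays.
2.times do
  comment[addr] ||= []
  comment[addr] |= ["function binding: esp -> esp+4"]
end

# A different string is appended after the existing one.
comment[addr] |= ["function ends at 401020h"]

p comment[addr].length  # => 2
```

The `||= []` line is what the inlined form has to repeat at each call site. Without it, `nil | [...]` raises a `NoMethodError` the first time an address is commented.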
+89 -652
@@ -99,28 +99,6 @@ class InstructionBlock
yield to if type == :indirect or dasm.function[to] or not dasm.decoded[to]
}
end
# returns the array used in each_from_samefunc
def from_samefunc(dasm)
ary = []
each_from_samefunc(dasm) { |a| ary << a }
ary
end
def from_otherfunc(dasm)
ary = []
each_from_otherfunc(dasm) { |a| ary << a }
ary
end
def to_samefunc(dasm)
ary = []
each_to_samefunc(dasm) { |a| ary << a }
ary
end
def to_otherfunc(dasm)
ary = []
each_to_otherfunc(dasm) { |a| ary << a }
ary
end
end
class DecodedInstruction
@@ -133,6 +111,44 @@ end
class CPU
# compat alias, for scripts using older version of metasm
def get_backtrace_binding(di) backtrace_binding(di) end
# return something like backtrace_binding in the forward direction
# set pc_reg to some reg name (eg :pc) to include effects on the instruction pointer
def get_fwdemu_binding(di, pc_reg=nil)
fdi = di.backtrace_binding ||= get_backtrace_binding(di)
# find self-updated regs & revert them in simultaneous assignments
# XXX handles only a <- a+i for now, this covers all useful cases (except imul eax, eax, 42 jz foobar)
fdi.keys.grep(::Symbol).each { |s|
val = Expression[fdi[s]]
next if val.lexpr != s or (val.op != :+ and val.op != :-) #or not val.rexpr.kind_of? ::Integer
fwd = { s => val }
inv = { s => val.dup }
inv[s].op = ((inv[s].op == :+) ? :- : :+)
nxt = {}
fdi.each { |k, v|
if k == s
nxt[k] = v
else
k = k.bind(fwd).reduce_rec if k.kind_of? Indirection
nxt[k] = Expression[Expression[v].bind(inv).reduce_rec]
end
}
fdi = nxt
}
if pc_reg
if di.opcode.props[:setip]
xr = get_xrefs_x(nil, di)
if xr and xr.length == 1
fdi[pc_reg] = xr[0]
else
fdi[:incomplete_binding] = Expression[1]
end
else
fdi[pc_reg] = Expression[pc_reg, :+, di.bin_length]
end
end
fdi
end
end
class Disassembler
@@ -140,16 +156,11 @@ class Disassembler
def self.backtrace_maxblocks ; @@backtrace_maxblocks ; end
def self.backtrace_maxblocks=(b) ; @@backtrace_maxblocks = b ; end
# adds a comment at the given address
# comments are stored in the hash @comment: {addr => [list of strings]}
def add_comment(addr, cmt)
@comment[addr] ||= []
@comment[addr] |= [cmt]
end
# returns the 1st element of #get_section_at (ie the edata at a given address) or nil
def get_edata_at(*a)
if s = get_section_at(*a)
# returns the dasm section's edata containing addr
# its #ptr points to addr
# returns the 1st element of #get_section_at
def get_edata_at(addr)
if s = get_section_at(addr)
s[0]
end
end
@@ -198,12 +209,12 @@ class Disassembler
# yields every InstructionBlock
# returns the list of IBlocks
def each_instructionblock(&b)
def each_instructionblock
ret = []
@decoded.each { |addr, di|
next if not di.kind_of? DecodedInstruction or not di.block_head?
ret << di.block
b.call(di.block) if b
yield di.block if block_given?
}
ret
end
@@ -282,19 +293,18 @@ class Disassembler
# returns the label associated with an addr, or nil if none exists
def get_label_at(addr)
e = get_edata_at(addr, false)
e, b = get_section_at(addr, false)
e.inv_export[e.ptr] if e
end
# sets the label for the specified address
# returns nil if the address is not mapped
# memcheck is passed to get_section_at to validate that the address is mapped
# keep existing label if 'overwrite' is false
def set_label_at(addr, name, memcheck=true, overwrite=true)
def set_label_at(addr, name, memcheck=true)
addr = Expression[addr].reduce
e, b = get_section_at(addr, memcheck)
if not e
elsif not l = e.inv_export[e.ptr] or (!overwrite and l != name)
elsif not l = e.inv_export[e.ptr]
l = @program.new_label(name)
e.add_export l, e.ptr
@label_alias_cache = nil
@@ -307,7 +317,7 @@ class Disassembler
# remove a label at address addr
def del_label_at(addr, name=get_label_at(addr))
ed = get_edata_at(addr)
ed, b = get_section_at(addr)
if ed and ed.inv_export[ed.ptr]
ed.del_export name, ed.ptr
@label_alias_cache = nil
@@ -315,7 +325,6 @@ class Disassembler
each_xref(addr) { |xr|
next if not xr.origin or not o = @decoded[xr.origin] or not o.kind_of? Renderable
o.each_expr { |e|
next unless e.kind_of?(Expression)
e.lexpr = addr if e.lexpr == name
e.rexpr = addr if e.rexpr == name
}
@@ -328,14 +337,12 @@ class Disassembler
# returns the new label
# the new label must be program-unique (see @program.new_label)
def rename_label(old, new)
return new if old == new
raise "label #{new.inspect} exists" if @prog_binding[new]
each_xref(normalize(old)) { |x|
next if not di = @decoded[x.origin]
@cpu.replace_instr_arg_immediate(di.instruction, old, new)
di.comment.to_a.each { |c| c.gsub!(old, new) }
}
e = get_edata_at(old, false)
e, l = get_section_at(old, false)
if e
e.add_export new, e.export.delete(old), true
end
@@ -492,12 +499,12 @@ class Disassembler
# if from..to spans multiple blocks
# to.block is split after to
# all paths from 'from' are replaced by a single link to after 'to', be careful!
# (eg a->b->... & a->c ; from in a, to in c => a->b is lost)
# (eg a->b->... & a->c ; from in a, to in c => a->b is lost)
# all instructions are stuffed in the first block
# paths are only walked using from/to_normal
# 'by' may be empty
# returns the block containing the new instrs (nil if empty)
def replace_instrs(from, to, by, patch_by=false)
def replace_instrs(from, to, by)
raise 'bad from' if not fdi = di_at(from) or not fdi.block.list.index(fdi)
raise 'bad to' if not tdi = di_at(to) or not tdi.block.list.index(tdi)
@@ -513,28 +520,14 @@ class Disassembler
wantlen -= by.grep(DecodedInstruction).inject(0) { |len, di| len + di.bin_length }
ldi = by.last
ldi = DecodedInstruction.new(ldi) if ldi.kind_of? Instruction
nb_i = by.grep(Instruction).length
wantlen = nb_i if wantlen < 0 or (ldi and ldi.opcode.props[:setip])
if patch_by
by.map! { |di|
if di.kind_of? Instruction
di = DecodedInstruction.new(di)
wantlen -= di.bin_length = wantlen / by.grep(Instruction).length
nb_i -= 1
end
di
}
else
by = by.map { |di|
if di.kind_of? Instruction
di = DecodedInstruction.new(di)
wantlen -= (di.bin_length = wantlen / nb_i)
nb_i -= 1
end
di
}
end
wantlen = by.grep(Instruction).length if wantlen < 0 or (ldi and ldi.opcode.props[:setip])
by.map! { |di|
if di.kind_of? Instruction
di = DecodedInstruction.new(di)
wantlen -= di.bin_length = wantlen / by.grep(Instruction).length
end
di
}
#puts " ** patch next_addr to #{Expression[tb.list.last.next_addr]}" if not by.empty? and by.last.opcode.props[:saveip]
by.last.next_addr = tb.list.last.next_addr if not by.empty? and by.last.opcode.props[:saveip]
@@ -656,8 +649,8 @@ class Disassembler
if b1 and not b1.kind_of? InstructionBlock
return if not b1 = block_at(b1)
end
if b2 and not b2.kind_of? InstructionBlock
return if not b2 = block_at(b2)
if b2 and not b2.kind_of? InstructionBlock
return if not b2 = block_at(b2)
end
if b1 and b2 and (allow_nonadjacent or b1.list.last.next_addr == b2.address) and
b1.to_normal.to_a == [b2.address] and b2.from_normal.to_a.length == 1 and # that handles delay_slot
@@ -727,23 +720,17 @@ class Disassembler
end
# returns a demangled C++ name
def demangle_cppname(name)
case name[0]
when ?? # MSVC
name = name[1..-1]
demangle_msvc(name[1..-1]) if name[0] == ??
when ?_
name = name.sub(/_GLOBAL__[ID]_/, '')
demangle_gcc(name[2..-1][/\S*/]) if name[0, 2] == '_Z'
end
end
# from wgcc-2.2.2/undecorate.cpp
# TODO
def demangle_msvc(name)
op = name[0, 1]
op = name[0, 2] if op == '_'
if op = {
def demangle_cppname(name)
ret = name
if name[0] == ??
name = name[1..-1]
if name[0] == ??
name = name[1..-1]
op = name[0, 1]
op = name[0, 2] if op == '_'
if op = {
'2' => "new", '3' => "delete", '4' => "=", '5' => ">>", '6' => "<<", '7' => "!", '8' => "==", '9' => "!=",
'A' => "[]", 'C' => "->", 'D' => "*", 'E' => "++", 'F' => "--", 'G' => "-", 'H' => "+", 'I' => "&",
'J' => "->*", 'K' => "/", 'L' => "%", 'M' => "<", 'N' => "<=", 'O' => ">", 'P' => ">=", 'Q' => ",",
@@ -756,157 +743,11 @@ class Disassembler
'_M' => "`eh vector destructor iterator'", '_N' => "`eh vector vbase constructor iterator'", '_O' => "`copy constructor closure'",
'_S' => "`local vftable'", '_T' => "`local vftable constructor closure'", '_U' => "new[]", '_V' => "delete[]",
'_X' => "`placement delete closure'", '_Y' => "`placement delete[] closure'"}[op]
op[0] == ?` ? op[1..-2] : "op_#{op}"
end
end
# from http://www.codesourcery.com/public/cxx-abi/abi.html
def demangle_gcc(name)
subs = []
ret = ''
decode_tok = lambda {
name ||= ''
case name[0]
when nil
ret = nil
when ?N
name = name[1..-1]
decode_tok[]
until name[0] == ?E
break if not ret
ret << '::'
decode_tok[]
end
name = name[1..-1]
when ?I
name = name[1..-1]
ret = ret[0..-3] if ret[-2, 2] == '::'
ret << '<'
decode_tok[]
until name[0] == ?E
break if not ret
ret << ', '
decode_tok[]
end
ret << ' ' if ret and ret[-1] == ?>
ret << '>' if ret
name = name[1..-1]
when ?T
case name[1]
when ?T; ret << 'vtti('
when ?V; ret << 'vtable('
when ?I; ret << 'typeinfo('
when ?S; ret << 'typename('
else ret = nil
end
name = name[2..-1].to_s
decode_tok[] if ret
ret << ')' if ret
name = name[1..-1] if name[0] == ?E
when ?C
name = name[2..-1]
base = ret[/([^:]*)(<.*|::)?$/, 1]
ret << base
when ?D
name = name[2..-1]
base = ret[/([^:]*)(<.*|::)?$/, 1]
ret << '~' << base
when ?0..?9
nr = name[/^[0-9]+/]
name = name[nr.length..-1].to_s
ret << name[0, nr.to_i]
name = name[nr.to_i..-1]
subs << ret[/[\w:]*$/]
when ?S
name = name[1..-1]
case name[0]
when ?_, ?0..?9, ?A..?Z
case name[0]
when ?_; idx = 0 ; name = name[1..-1]
when ?0..?9; idx = name[0, 1].unpack('C')[0] - 0x30 + 1 ; name = name[2..-1]
when ?A..?Z; idx = name[0, 1].unpack('C')[0] - 0x41 + 11 ; name = name[2..-1]
end
if not subs[idx]
ret = nil
else
ret << subs[idx]
end
when ?t
ret << 'std::'
name = name[1..-1]
decode_tok[]
else
std = { ?a => 'std::allocator',
?b => 'std::basic_string',
?s => 'std::string', # 'std::basic_string < char, std::char_traits<char>, std::allocator<char> >',
?i => 'std::istream', # 'std::basic_istream<char, std::char_traits<char> >',
?o => 'std::ostream', # 'std::basic_ostream<char, std::char_traits<char> >',
?d => 'std::iostream', # 'std::basic_iostream<char, std::char_traits<char> >'
}[name[0]]
if not std
ret = nil
else
ret << std
end
name = name[1..-1]
end
when ?P, ?R, ?r, ?V, ?K
attr = { ?P => '*', ?R => '&', ?r => ' restrict', ?V => ' volatile', ?K => ' const' }[name[0]]
name = name[1..-1]
rl = ret.length
decode_tok[]
if ret
ret << attr
subs << ret[rl..-1]
end
else
if ret =~ /[(<]/ and ty = {
?v => 'void', ?w => 'wchar_t', ?b => 'bool', ?c => 'char', ?a => 'signed char',
?h => 'unsigned char', ?s => 'short', ?t => 'unsigned short', ?i => 'int',
?j => 'unsigned int', ?l => 'long', ?m => 'unsigned long', ?x => '__int64',
?y => 'unsigned __int64', ?n => '__int128', ?o => 'unsigned __int128', ?f => 'float',
?d => 'double', ?e => 'long double', ?g => '__float128', ?z => '...'
}[name[0]]
name = name[1..-1]
ret << ty
else
fu = name[0, 2]
name = name[2..-1]
if op = {
'nw' => ' new', 'na' => ' new[]', 'dl' => ' delete', 'da' => ' delete[]',
'ps' => '+', 'ng' => '-', 'ad' => '&', 'de' => '*', 'co' => '~', 'pl' => '+',
'mi' => '-', 'ml' => '*', 'dv' => '/', 'rm' => '%', 'an' => '&', 'or' => '|',
'eo' => '^', 'aS' => '=', 'pL' => '+=', 'mI' => '-=', 'mL' => '*=', 'dV' => '/=',
'rM' => '%=', 'aN' => '&=', 'oR' => '|=', 'eO' => '^=', 'ls' => '<<', 'rs' => '>>',
'lS' => '<<=', 'rS' => '>>=', 'eq' => '==', 'ne' => '!=', 'lt' => '<', 'gt' => '>',
'le' => '<=', 'ge' => '>=', 'nt' => '!', 'aa' => '&&', 'oo' => '||', 'pp' => '++',
'mm' => '--', 'cm' => ',', 'pm' => '->*', 'pt' => '->', 'cl' => '()', 'ix' => '[]',
'qu' => '?', 'st' => ' sizeof', 'sz' => ' sizeof', 'at' => ' alignof', 'az' => ' alignof'
}[fu]
ret << "operator#{op}"
elsif fu == 'cv'
ret << "cast<"
decode_tok[]
ret << ">" if ret
else
ret = nil
end
ret = op[0] == ?` ? op[1..-2] : "op_#{op}"
end
end
name ||= ''
}
decode_tok[]
subs.pop
if ret and name != ''
ret << '('
decode_tok[]
while ret and name != ''
ret << ', '
decode_tok[]
end
ret << ')' if ret
end
# TODO
ret
end
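The restored `demangle_cppname` only recognizes MSVC operator names: it strips the leading `??`, reads a one-character code (or two characters when the code starts with `_`), and looks it up in a table. A minimal standalone sketch of that lookup follows. The method name `demangle_msvc_operator` and the constant `MSVC_OPERATORS` are hypothetical, and the table is a small subset of the one in the diff.

```ruby
# Hypothetical, trimmed-down copy of the operator table shown in the diff.
MSVC_OPERATORS = {
  '2'  => 'new',
  '3'  => 'delete',
  '4'  => '=',
  '8'  => '==',
  '_U' => 'new[]',
  '_V' => 'delete[]',
  '_S' => "`local vftable'"
}.freeze

# Returns a readable name for a "??<code>..." symbol, or the input unchanged
# when the symbol is not a recognized MSVC operator.
def demangle_msvc_operator(name)
  return name unless name.start_with?('??')
  body = name[2..-1]
  code = body[0, 1]
  code = body[0, 2] if code == '_'   # two-character codes start with '_'
  op = MSVC_OPERATORS[code]
  return name unless op
  # backquoted descriptions lose their quotes, plain operators get an op_ prefix
  op.start_with?('`') ? op[1..-2] : "op_#{op}"
end
```

Anything after the operator code (class name, argument encoding) is ignored here, as it is in the method this sketch mirrors.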
@@ -914,8 +755,7 @@ class Disassembler
# return/yields all the addresses matching
# if yield returns nil/false, do not include the addr in the final result
# sections are scanned MB by MB, so this should work (slowly) on 4GB sections (eg debugger VM)
# with addr_start/length, symbol-based section are skipped
def pattern_scan(pat, addr_start=nil, length=nil, chunksz=nil, margin=nil, &b)
def pattern_scan(pat, chunksz=nil, margin=nil)
chunksz ||= 4*1024*1024 # scan 4MB at a time
margin ||= 65536 # add this many bytes to each chunk to find /pat/ across chunk boundaries
@@ -923,27 +763,9 @@ class Disassembler
found = []
@sections.each { |sec_addr, e|
if addr_start
length ||= 0x1000_0000
begin
if sec_addr < addr_start
next if sec_addr+e.length <= addr_start
e = e[addr_start-sec_addr, e.length]
sec_addr = addr_start
end
if sec_addr+e.length > addr_start+length
next if sec_addr > addr_start+length
e = e[0, sec_addr+e.length-(addr_start+length)]
end
rescue
puts $!, $!.message, $!.backtrace if $DEBUG
# catch arithmetic error with symbol-based section
next
end
end
e.pattern_scan(pat, chunksz, margin) { |eo|
match_addr = sec_addr + eo
found << match_addr if not b or b.call(match_addr)
found << match_addr if not block_given? or yield(match_addr)
false
}
}
@@ -951,14 +773,14 @@ class Disassembler
end
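The `chunksz` and `margin` parameters describe a windowed scan: each window reads a little past its nominal end so that a match straddling a chunk boundary is not lost. The sketch below illustrates the idea on a plain String. It is not Metasm's `EncodedData#pattern_scan`; the name `chunked_scan` and the tiny default sizes are assumptions chosen to keep the example readable.

```ruby
# Scans +data+ for +pat+ in windows of +chunksz+ bytes, reading +margin+ extra
# bytes per window so matches straddling a boundary are still found.
# Returns the offsets of all matches.
def chunked_scan(data, pat, chunksz = 16, margin = 4)
  found = []
  off = 0
  while off < data.length
    window = data[off, chunksz + margin]
    pos = 0
    while (m = window.index(pat, pos))
      break if m >= chunksz          # match starts in the margin: next window reports it
      found << off + m
      pos = m + 1
    end
    off += chunksz
  end
  found
end
```

A match is only guaranteed to be found when it is no longer than `margin` plus one byte, which is presumably why the original defaults to a generous 64 KiB margin on 4 MiB chunks.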
# returns/yields [addr, string] found using pattern_scan /[\x20-\x7e]/
def strings_scan(minlen=6, &b)
def strings_scan(minlen=6)
ret = []
nexto = 0
pattern_scan(/[\x20-\x7e]{#{minlen},}/m, nil, 1024) { |o|
pattern_scan(/[\x20-\x7e]{#{minlen},}/nm, nil, 1024) { |o|
if o - nexto > 0
next unless e = get_edata_at(o)
str = e.data[e.ptr, 1024][/[\x20-\x7e]{#{minlen},}/m]
ret << [o, str] if not b or b.call(o, str)
str = e.data[e.ptr, 1024][/[\x20-\x7e]{#{minlen},}/nm]
ret << [o, str] if not block_given? or yield(o, str)
nexto = o + str.length
end
}
@@ -983,23 +805,18 @@ class Disassembler
def load_map(str, off=0)
str = File.read(str) rescue nil if not str.index("\n")
sks = @sections.keys.sort
seen = {}
str.each_line { |l|
case l.strip
when /^([0-9A-F]+)\s+(\w+)\s+(\w+)/i # kernel.map style
addr = $1.to_i(16)+off
set_label_at(addr, $3, false, !seen[addr])
seen[addr] = true
set_label_at($1.to_i(16)+off, $3)
when /^([0-9A-F]+):([0-9A-F]+)\s+([a-z_]\w+)/i # IDA style
# we do not have section load order, let's just hope that the addresses are sorted (and sortable..)
# could check the 1st part of the file, with section sizes, but it is not very convenient
# the regexp is so that we skip the 1st part with section descriptions
# in the file, section 1 is the 1st section ; we have an additional section (exe header) which fixes the 0-index
addr = sks[$1.to_i(16)] + $2.to_i(16) + off
set_label_at(addr, $3, false, !seen[addr])
seen[addr] = true
set_label_at(sks[$1.to_i(16)] + $2.to_i(16) + off, $3)
end
}
}
end
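The "kernel.map style" branch of `load_map` boils down to one regular expression applied line by line. Here is a standalone sketch that collects the result in a Hash instead of calling `set_label_at`. The method name `parse_map_lines` is hypothetical, and keeping the first name seen for a given address is an arbitrary choice made for this example.

```ruby
# Parses "ADDR TYPE NAME" lines (kernel.map / nm style) into { addr => name }.
# +off+ is added to every address, as in load_map.
def parse_map_lines(str, off = 0)
  labels = {}
  str.each_line do |l|
    if l.strip =~ /^([0-9A-F]+)\s+(\w+)\s+(\w+)/i
      addr = $1.to_i(16) + off
      labels[addr] ||= $3     # keep the first name seen for an address
    end
  end
  labels
end
```

IDA-style lines such as `0001:00000010 name` do not match this pattern, because the colon is neither a hex digit nor whitespace, so they fall through to the second branch in the original.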
# saves the dasm state in a file
@@ -1013,14 +830,13 @@ class Disassembler
def save_io(fd)
fd.puts 'Metasm.dasm'
if @program.filename and not @program.kind_of?(Shellcode)
if @program.filename
t = @program.filename.to_s
fd.puts "binarypath #{t.length}", t
else
t = "#{@cpu.class.name.sub(/.*::/, '')} #{@cpu.size} #{@cpu.endianness}"
fd.puts "cpu #{t.length}", t
# XXX will be reloaded as a Shellcode with this CPU, but it may be a custom EXE
# do not output binarypath, we'll be loaded as a Shellcode, 'section' will suffice
end
@sections.each { |a, e|
@@ -1126,7 +942,6 @@ class Disassembler
reinitialize Shellcode.new(cpu)
@program.disassembler = self
@program.init_disassembler
@sections.delete(0) # rm empty section at 0, other real 'section' follow
when 'section'
info = data[0, data.index("\n") || data.length]
data = data[info.length, data.length]
@@ -1215,7 +1030,7 @@ class Disassembler
len = (len != '' ? len.to_i : nil)
o = (o.to_s != '' ? Expression.parse(pp.feed!(o)).reduce : nil) # :default/:unknown ?
add_xref(a, Xref.new(t, o, len))
rescue
rescue
puts "load: bad xref #{l.inspect} #$!" if $VERBOSE
end
}
@@ -1289,354 +1104,12 @@ class Disassembler
delta
end
# dataflow method
# walks a function, starting at addr
# follows the usage of registers, computing the evolution from the value they had at start_addr
# whenever an instruction references the register (or anything derived from it),
# yield [di, used_register, reg_value, trace_state] where reg_value is the Expression holding the value of
# the register wrt the initial value at start_addr, and trace_state the value of all registers (reg_value
# not yet applied)
# reg_value may be nil if used_register is not modified by the function (eg call [eax])
# the yield return value is propagated, unless it is nil/false
# init_state is a hash { :reg => initial value }
def trace_function_register(start_addr, init_state)
function_walk(start_addr, init_state) { |args|
trace_state = args.last
case args.first
when :di
di = args[2]
update = {}
get_fwdemu_binding(di).each { |r, v|
if v.kind_of?(Expression) and v.externals.find { |e| trace_state[e] }
# XXX may mix old (from trace) and current (from v) registers
newv = v.bind(trace_state)
update[r] = yield(di, r, newv, trace_state)
elsif r.kind_of?(ExpressionType) and rr = r.externals.find { |e| trace_state[e] }
# reg dereferenced in a write (eg mov [esp], 42)
next if update.has_key?(rr) # already yielded
if yield(di, rr, trace_state[rr], trace_state) == false
update[rr] = false
end
elsif trace_state[r]
# started on mov reg, foo
next if di.address == start_addr
update[r] = false
end
}
# directly walk the instruction argument list for registers not appearing in the binding
@cpu.instr_args_memoryptr(di).each { |ind|
b = @cpu.instr_args_memoryptr_getbase(ind)
if b and b = b.symbolic and not update.has_key?(b)
yield(di, b, nil, trace_state)
end
}
@cpu.instr_args_regs(di).each { |r|
r = r.symbolic
if not update.has_key?(r)
yield(di, r, nil, trace_state)
end
}
update.each { |r, v|
trace_state = trace_state.dup
if v
# cannot follow non-registers, or we would have to emulate every single
# instruction (try following [esp+4] across a __stdcall..)
trace_state[r] = v if r.kind_of?(::Symbol)
else
trace_state.delete r
end
}
when :subfunc
faddr = args[1]
f = @function[faddr]
f = @function[f.backtrace_binding[:thunk]] if f and f.backtrace_binding[:thunk]
if f
binding = f.backtrace_binding
if binding.empty?
backtrace_update_function_binding(faddr)
binding = f.backtrace_binding
end
# XXX fwdemu_binding ?
binding.each { |r, v|
if v.externals.find { |e| trace_state[e] }
if r.kind_of?(::Symbol)
trace_state = trace_state.dup
trace_state[r] = Expression[v.bind(trace_state)].reduce
end
elsif trace_state[r]
trace_state = trace_state.dup
trace_state.delete r
end
}
end
when :merge
# when merging paths, keep the smallest common state subset
# XXX may have unexplored froms
conflicts = args[2]
trace_state = trace_state.dup
conflicts.each { |addr, st|
trace_state.delete_if { |k, v| st[k] != v }
}
end
trace_state = false if trace_state.empty?
trace_state
}
end
# define a register as a pointer to a structure
# rename all [reg+off] as [reg+struct.member] in current function
# also trace assignments of pointer members
def trace_update_reg_structptr(addr, reg, structname, structoff=0)
sname = soff = ctx = nil
expr_to_sname = lambda { |expr|
if not expr.kind_of?(Expression) or expr.op != :+
sname = nil
next
end
sname = expr.lexpr || expr.rexpr
soff = (expr.lexpr ? expr.rexpr : 0)
if soff.kind_of?(Expression)
# ignore index in ptr array
if soff.op == :* and soff.lexpr == @cpu.size/8
soff = 0
elsif soff.rexpr.kind_of?(Expression) and soff.rexpr.op == :* and soff.rexpr.lexpr == @cpu.size/8
soff = soff.lexpr
elsif soff.lexpr.kind_of?(Expression) and soff.lexpr.op == :* and soff.lexpr.lexpr == @cpu.size/8
soff = soff.rexpr
end
elsif soff.kind_of?(::Symbol)
# array with 1 byte elements / pre-scaled idx?
if not ctx[soff]
soff = 0
end
end
}
lastdi = nil
trace_function_register(addr, reg => Expression[structname, :+, structoff]) { |di, r, val, trace|
next if r.to_s =~ /flag/ # XXX maybe too ia32-specific?
ctx = trace
@cpu.instr_args_memoryptr(di).each { |ind|
# find the structure dereference in di
b = @cpu.instr_args_memoryptr_getbase(ind)
b = b.symbolic if b
next unless trace[b]
imm = @cpu.instr_args_memoryptr_getoffset(ind) || 0
# check expr has the form 'traced_struct_reg + off'
expr_to_sname[trace[b] + imm] # Expr#+ calls Expr#reduce
next unless sname.kind_of?(::String) and soff.kind_of?(::Integer)
next if not st = c_parser.toplevel.struct[sname] or not st.kind_of?(C::Union)
# ignore lea esi, [esi+0]
next if soff == 0 and not di.backtrace_binding.find { |k, v| v-k != 0 }
# TODO if trace[b] offset != 0, we had a lea reg, [struct+substruct_off], tweak str accordingly
# resolve struct + off into struct.membername
str = st.name.dup
mb = st.expand_member_offset(c_parser, soff, str)
# patch di
imm = imm.rexpr if imm.kind_of?(Expression) and not imm.lexpr and imm.rexpr.kind_of?(ExpressionString)
imm = imm.expr if imm.kind_of?(ExpressionString)
@cpu.instr_args_memoryptr_setoffset(ind, ExpressionString.new(imm, str, :structoff))
# check if the type is an enum/bitfield, patch instruction immediates
trace_update_reg_structptr_arg_enum(di, ind, mb, str) if mb
} if lastdi != di.address
lastdi = di.address
next Expression[structname, :+, structoff] if di.address == addr and r == reg
# check if we need to trace 'r' further
val = val.reduce_rec if val.kind_of?(Expression)
val = Expression[val] if val.kind_of?(::String)
case val
when Expression
# only trace trivial structptr+off expressions
expr_to_sname[val]
if sname.kind_of?(::String) and soff.kind_of?(::Integer)
Expression[sname, :+, soff]
end
when Indirection
# di is mov reg, [ptr+struct.offset]
# check if the target member is a pointer to a struct, if so, trace it
expr_to_sname[val.pointer.reduce]
next unless sname.kind_of?(::String) and soff.kind_of?(::Integer)
if st = c_parser.toplevel.struct[sname] and st.kind_of?(C::Union)
pt = st.expand_member_offset(c_parser, soff, '')
pt = pt.untypedef if pt
if pt.kind_of?(C::Pointer)
tt = pt.type.untypedef
stars = ''
while tt.kind_of?(C::Pointer)
stars << '*'
tt = tt.type.untypedef
end
if tt.kind_of?(C::Union) and tt.name
Expression[tt.name + stars]
end
end
elsif soff == 0 and sname[-1] == ?*
# XXX pointer to pointer to struct
# full C type support would be better, but harder to fit in an Expr
Expression[sname[0...-1]]
end
# in other cases, stop trace
end
}
end
# found a special member of a struct, check if we can apply
# bitfield/enum name to other constants in the di
def trace_update_reg_structptr_arg_enum(di, ind, mb, str)
if ename = mb.has_attribute_var('enum') and enum = c_parser.toplevel.struct[ename] and enum.kind_of?(C::Enum)
# handle enums: struct moo { int __attribute__((enum(bla))) fld; };
doit = lambda { |_di|
if num = _di.instruction.args.grep(Expression).first and num_i = num.reduce and num_i.kind_of?(::Integer)
# handle enum values on tagged structs
if enum.members and name = enum.members.index(num_i)
num.lexpr = nil
num.op = :+
num.rexpr = ExpressionString.new(Expression[num_i], name, :enum)
_di.add_comment "enum::#{ename}" if _di.address != di.address
end
end
}
doit[di]
# mov eax, [ptr+struct.enumfield] => trace eax
if reg = @cpu.instr_args_regs(di).find { |r| v = di.backtrace_binding[r.symbolic] and (v - ind.symbolic) == 0 }
reg = reg.symbolic
trace_function_register(di.address, reg => Expression[0]) { |_di, r, val, trace|
next if r != reg and val != Expression[reg]
doit[_di]
val
}
end
elsif mb.untypedef.kind_of?(C::Struct)
# handle bitfields
byte_off = 0
if str =~ /\+(\d+)$/
# test byte [bitfield+1], 0x1 => test dword [bitfield], 0x100
# XXX little-endian only
byte_off = $1.to_i
str[/\+\d+$/] = ''
end
cmt = str.split('.')[-2, 2].join('.') if str.count('.') > 1
doit = lambda { |_di, add|
if num = _di.instruction.args.grep(Expression).first and num_i = num.reduce and num_i.kind_of?(::Integer)
# TODO handle ~num_i
num_left = num_i << add
s_or = []
mb.untypedef.members.each { |mm|
if bo = mb.bitoffsetof(c_parser, mm)
boff, blen = bo
if mm.name && blen == 1 && ((num_left >> boff) & 1) > 0
s_or << mm.name
num_left &= ~(1 << boff)
end
end
}
if s_or.first
if num_left != 0
s_or << ('0x%X' % num_left)
end
s = s_or.join('|')
num.lexpr = nil
num.op = :+
num.rexpr = ExpressionString.new(Expression[num_i], s, :bitfield)
_di.add_comment cmt if _di.address != di.address
end
end
}
doit[di, byte_off*8]
if reg = @cpu.instr_args_regs(di).find { |r| v = di.backtrace_binding[r.symbolic] and (v - ind.symbolic) == 0 }
reg = reg.symbolic
trace_function_register(di.address, reg => Expression[0]) { |_di, r, val, trace|
if r.kind_of?(Expression) and r.op == :&
if r.lexpr == reg
# test al, 42
doit[_di, byte_off*8]
elsif r.lexpr.kind_of?(Expression) and r.lexpr.op == :>> and r.lexpr.lexpr == reg
# test ah, 42
doit[_di, byte_off*8+r.lexpr.rexpr]
end
end
next if r != reg and val != Expression[reg]
doit[_di, byte_off*8]
_di.address == di.address && r == reg ? Expression[0] : val
}
end
end
end
# change Expression display mode for current object o to display integers as char constants
def toggle_expr_char(o)
return if not o.kind_of?(Renderable)
tochars = lambda { |v|
if v.kind_of?(::Integer)
a = []
vv = v.abs
a << (vv & 0xff)
vv >>= 8
while vv > 0
a << (vv & 0xff)
vv >>= 8
end
if a.all? { |b| b < 0x7f }
s = a.pack('C*').inspect.gsub("'") { '\\\'' }[1...-1]
ExpressionString.new(v, (v > 0 ? "'#{s}'" : "-'#{s}'"), :char)
end
end
}
return if not o.kind_of? Renderable
o.each_expr { |e|
if e.kind_of?(Expression)
if nr = tochars[e.rexpr]
e.rexpr = nr
elsif e.rexpr.kind_of?(ExpressionString) and e.rexpr.type == :char
e.rexpr = e.rexpr.expr
end
if nl = tochars[e.lexpr]
e.lexpr = nl
elsif e.lexpr.kind_of?(ExpressionString) and e.lexpr.type == :char
e.lexpr = e.lexpr.expr
end
end
}
end
def toggle_expr_dec(o)
return if not o.kind_of?(Renderable)
o.each_expr { |e|
if e.kind_of?(Expression)
if e.rexpr.kind_of?(::Integer)
e.rexpr = ExpressionString.new(Expression[e.rexpr], e.rexpr.to_s, :decimal)
elsif e.rexpr.kind_of?(ExpressionString) and e.rexpr.type == :decimal
e.rexpr = e.rexpr.reduce
end
if e.lexpr.kind_of?(::Integer)
e.lexpr = ExpressionString.new(Expression[e.lexpr], e.lexpr.to_s, :decimal)
elsif e.lexpr.kind_of?(ExpressionString) and e.lexpr.type == :decimal
e.lexpr = e.lexpr.reduce
end
end
e.render_info ||= {}
e.render_info[:char] = e.render_info[:char] ? nil : @cpu.endianness
}
end
@@ -1645,7 +1118,6 @@ class Disassembler
def toggle_expr_offset(o)
return if not o.kind_of? Renderable
o.each_expr { |e|
next unless e.kind_of?(Expression)
if n = @prog_binding[e.lexpr]
e.lexpr = n
elsif e.lexpr.kind_of? ::Integer and n = get_label_at(e.lexpr)
@@ -1661,15 +1133,6 @@ class Disassembler
}
end
# toggle all ExpressionStrings
def toggle_expr_str(o)
return if not o.kind_of?(Renderable)
o.each_expr { |e|
next unless e.kind_of?(ExpressionString)
e.hide_str = !e.hide_str
}
end
# call this function on a function entrypoint if the function is in fact a __noreturn
# will cut the to_subfuncret of callers
def fix_noreturn(o)
@@ -1721,7 +1184,7 @@ class Disassembler
# searched for in the Metasmdir/samples/dasm-plugins subdirectory if not found in cwd
def load_plugin(plugin_filename)
if not File.exist?(plugin_filename)
if File.exist?(plugin_filename+'.rb')
if File.exist?(plugin_filename+'.rb')
plugin_filename += '.rb'
elsif defined? Metasmdir
# try autocomplete
@@ -1762,7 +1225,7 @@ class Disassembler
if bd2.kind_of? DecodedInstruction
bd2 = bd2.backtrace_binding ||= cpu.get_backtrace_binding(bd2)
end
reduce = lambda { |e| Expression[Expression[e].reduce] }
bd = {}
@@ -1813,31 +1276,5 @@ class Disassembler
bd
end
def gui_hilight_word_regexp(word)
@cpu.gui_hilight_word_regexp(word)
end
# return a C::AllocCStruct from c_parser
# TODO handle program.class::Header.to_c_struct
def decode_c_struct(structname, addr)
if c_parser and edata = get_edata_at(addr)
c_parser.decode_c_struct(structname, edata.data, edata.ptr)
end
end
def decode_c_ary(structname, addr, len)
if c_parser and edata = get_edata_at(addr)
c_parser.decode_c_ary(structname, len, edata.data, edata.ptr)
end
end
# find the function containing addr, and find & rename stack vars in it
def name_local_vars(addr)
if @cpu.respond_to?(:name_local_vars) and faddr = find_function_start(addr)
@function[faddr] ||= DecodedFunction.new # XXX
@cpu.name_local_vars(self, faddr)
end
end
end
end
+119 -208
@@ -52,17 +52,15 @@ extern VALUE *rb_cObject __attribute__((import));
extern VALUE *rb_eRuntimeError __attribute__((import));
extern VALUE *rb_eArgError __attribute__((import));
#define Qfalse ((VALUE)0)
#define Qtrue ((VALUE)2)
#define Qnil ((VALUE)4)
// allows generating a ruby1.9 dynldr.so from ruby1.8
#ifndef DYNLDR_RUBY_19
#define DYNLDR_RUBY_19 #{RUBY_VERSION >= '1.9' ? 1 : 0}
#endif
#if #{RUBY_VERSION >= '2.0' ? 1 : 0}
// flonums. WHY?
// also breaks Qtrue/Qnil
#define rb_float_new rb_float_new_in_heap
#endif
#if DYNLDR_RUBY_19
#define T_STRING 0x05
#define T_ARRAY 0x07
@@ -165,7 +163,7 @@ static VALUE memory_write(VALUE self, VALUE addr, VALUE val)
static VALUE memory_write_int(VALUE self, VALUE addr, VALUE val)
{
*(uintptr_t *)VAL2INT(addr) = VAL2INT(val);
return 1;
return Qtrue;
}
static VALUE str_ptr(VALUE self, VALUE str)
@@ -202,7 +200,7 @@ static VALUE sym_addr(VALUE self, VALUE lib, VALUE func)
if (TYPE(func) != T_STRING && TYPE(func) != T_FIXNUM)
rb_raise(*rb_eArgError, "Invalid func");
if (TYPE(func) == T_FIXNUM)
p = os_load_sym_ord(h, VAL2INT(func));
else
@@ -226,7 +224,7 @@ static VALUE invoke(VALUE self, VALUE ptr, VALUE args, VALUE flags)
{
if (TYPE(args) != T_ARRAY || ARY_LEN(args) > 64)
rb_raise(*rb_eArgError, "bad args");
uintptr_t flags_v = VAL2INT(flags);
uintptr_t ptr_v = VAL2INT(ptr);
unsigned i, argsz;
@@ -243,7 +241,7 @@ static VALUE invoke(VALUE self, VALUE ptr, VALUE args, VALUE flags)
ret = do_invoke_stdcall(ptr_v, argsz, args_c);
else
ret = do_invoke(ptr_v, argsz, args_c);
if (flags_v & 4)
return rb_ull2inum((unsigned __int64)ret);
else if (flags_v & 8)
@@ -259,27 +257,23 @@ static VALUE invoke(VALUE self, VALUE ptr, VALUE args, VALUE flags)
// callback generated by callback_alloc
// heavy stack magick at work here !
// TODO float args / float retval / ret __int64
uintptr_t do_callback_handler(uintptr_t ori_retaddr, uintptr_t caller_id, uintptr_t arg0, uintptr_t arg_ecx __attribute__((register(ecx))), uintptr_t arg_edx __attribute__((register(edx))))
uintptr_t do_callback_handler(uintptr_t ori_retaddr, uintptr_t caller_id, uintptr_t arg0)
{
uintptr_t *addr = &arg0;
unsigned i, ret;
VALUE args = rb_ary_new2(10);
// __fastcall callback args
ARY_PTR(args)[0] = INT2VAL(arg_ecx);
ARY_PTR(args)[1] = INT2VAL(arg_edx);
VALUE args = rb_ary_new2(8);
// copy our args to a ruby-accessible buffer
for (i=2U ; i<10U ; ++i)
for (i=0U ; i<8U ; ++i)
ARY_PTR(args)[i] = INT2VAL(*addr++);
RArray(args)->len = 10U; // len == 10, no need to ARY_LEN/EMBED stuff
RArray(args)->len = 8U; // len == 8, no need to ARY_LEN/EMBED stuff
ret = rb_funcall(dynldr, rb_intern("callback_run"), 2, INT2VAL(caller_id), args);
// dynldr.callback will give us the arity (in bytes) of the callback in args[0]
// we just put the stack lifting offset in caller_id for the asm stub to use
caller_id = VAL2INT(ARY_PTR(args)[0]);
return VAL2INT(ret);
}
@@ -296,7 +290,7 @@ static VALUE invoke(VALUE self, VALUE ptr, VALUE args, VALUE flags)
{
if (TYPE(args) != T_ARRAY || ARY_LEN(args) > 16)
rb_raise(*rb_eArgError, "bad args");
uintptr_t flags_v = VAL2INT(flags);
uintptr_t ptr_v = VAL2INT(ptr);
int i, argsz;
@@ -318,7 +312,7 @@ static VALUE invoke(VALUE self, VALUE ptr, VALUE args, VALUE flags)
args_c[4], args_c[5], args_c[6], args_c[7],
args_c[8], args_c[9], args_c[10], args_c[11],
args_c[12], args_c[13], args_c[14], args_c[15]);
if (flags_v & 8)
return rb_float_new(fake_float());
@@ -385,7 +379,8 @@ static void *wstrcaseruby(short *s1, int len)
{
int i = 0;
int match = 0;
char *want = "ruby"; // cant contain the same letter twice
static char *want = "ruby"; // cant contain the same letter twice
while (i < len) {
if (want[match] == (s1[i] | 0x20)) { // downcase cmp
@@ -479,11 +474,11 @@ int load_ruby_imports(uintptr_t rbaddr)
if (rbaddr)
ruby_module = find_ruby_module_mem(rbaddr);
else
ruby_module = find_ruby_module_peb();
ruby_module = find_ruby_module_peb();
if (!ruby_module)
return 0;
ptr = &ruby_import_table;
table = (char*)ptr;
@@ -499,7 +494,7 @@ int load_ruby_imports(uintptr_t rbaddr)
#ifdef __x86_64__
#define DLL_PROCESS_ATTACH 1
int DllMain(void *handle, int reason, void *res)
__stdcall int DllMain(void *handle, int reason, void *res)
{
if (reason == DLL_PROCESS_ATTACH)
return load_ruby_imports(0);
@@ -514,7 +509,7 @@ EOS
do_invoke_fastcall:
push ebp
mov ebp, esp
// load ecx/edx, fix arg/argcount
mov eax, [ebp+16]
mov ecx, [eax]
@@ -632,7 +627,7 @@ EOS
# save the shared library
bin.encode_file(modulename, :lib)
end
def self.compile_binary_module_hack(bin)
# this is a hack
# we need the module to use ruby symbols
@@ -770,7 +765,7 @@ EOS
else raise LoadError, "Unsupported host platform #{RUBY_PLATFORM}"
end
end
# returns whether we run on linux or windows
def self.host_arch
case RUBY_PLATFORM
@@ -793,73 +788,16 @@ EOS
cp.parse(src)
end
# compile a C fragment into a Shellcode_RWX, honors the host ABI
# compile a C fragment into a Shellcode, honors the host ABI
def self.compile_c(src)
# XXX could we reuse self.cp ? (for its macros etc)
cp = C::Parser.new(host_exe.new(host_cpu))
cp.parse(src)
sc = Shellcode_RWX.new(host_cpu)
sc = Shellcode.new(host_cpu)
asm = host_cpu.new_ccompiler(cp, sc).compile
sc.assemble(asm)
end
# maps a Shellcode_RWX in memory, fixup stdlib relocations
# returns the Shellcode_RWX, with the base_r/w/x initialized to the allocated memory
def self.sc_map_resolve(sc)
sc_map_resolve_addthunks(sc)
sc.base_r = memory_alloc(sc.encoded_r.length) if sc.encoded_r.length > 0
sc.base_w = memory_alloc(sc.encoded_w.length) if sc.encoded_w.length > 0
sc.base_x = memory_alloc(sc.encoded_x.length) if sc.encoded_x.length > 0
locals = sc.encoded_r.export.keys | sc.encoded_w.export.keys | sc.encoded_x.export.keys
exts = sc.encoded_r.reloc_externals(locals) | sc.encoded_w.reloc_externals(locals) | sc.encoded_x.reloc_externals(locals)
bd = {}
exts.uniq.each { |ext| bd[ext] = sym_addr(lib_from_sym(ext), ext) or raise rescue raise "unknown symbol #{ext.inspect}" }
sc.fixup_check(bd)
memory_write sc.base_r, sc.encoded_r.data if sc.encoded_r.length > 0
memory_write sc.base_w, sc.encoded_w.data if sc.encoded_w.length > 0
memory_write sc.base_x, sc.encoded_x.data if sc.encoded_x.length > 0
memory_perm sc.base_r, sc.encoded_r.length, 'r' if sc.encoded_r.length > 0
memory_perm sc.base_w, sc.encoded_w.length, 'rw' if sc.encoded_w.length > 0
memory_perm sc.base_x, sc.encoded_x.length, 'rx' if sc.encoded_x.length > 0
sc
end
def self.sc_map_resolve_addthunks(sc)
case host_cpu.shortname
when 'x64'
# patch 'call moo' into 'call thunk; thunk: jmp qword [moo_ptr]'
# this is similar to ELF PLT section, allowing code to call
# into a library mapped more than 4G away
# XXX handles only 'call extern', not 'lea reg, extern' or anything else
# in this case, the linker will still raise an 'immediate overflow'
# during fixup_check in sc_map_resolve
[sc.encoded_r, sc.encoded_w, sc.encoded_x].each { |edata|
edata.reloc.dup.each { |off, rel|
# target only call extern / jmp.i32 extern
next if rel.type != :i32
next if rel.target.op != :-
next if edata.export[rel.target.rexpr] != off+4
next if edata.export[rel.target.lexpr]
opc = edata.data[off-1, 1].unpack('C')[0]
next if opc != 0xe8 and opc != 0xe9
thunk_sc = Shellcode.new(host_cpu).share_namespace(sc)
thunk = thunk_sc.assemble(<<EOS).encoded
1: jmp qword [rip]
dq #{rel.target.lexpr}
EOS
edata << thunk
rel.target.lexpr = thunk.inv_export[0]
}
}
end
end
# retrieve the library where a symbol is to be found (uses AutoImport)
def self.lib_from_sym(symname)
case host_arch
@@ -883,7 +821,7 @@ EOS
cp.toplevel.symbol.delete v.name
lib = fromlib || lib_from_sym(v.name)
addr = sym_addr(lib, v.name)
if addr == 0 or addr == -1 or addr == 0xffff_ffff or addr == 0xffffffff_ffffffff
if addr == 0 or addr == -1 or addr == 0xffff_ffff or addr == 0xffffffff_ffffffff
api_not_found(lib, v)
next
end
@@ -974,11 +912,11 @@ EOS
flags |= 4 if proto.type.type.integral? and cp.sizeof(nil, proto.type.type) == 8
flags |= 8 if proto.type.type.float?
class << self ; self ; end.send(:define_method, name) { |*a|
raise ArgumentError, "bad arg count for #{name}: #{a.length} for #{proto.type.args.to_a.length}" if a.length != proto.type.args.to_a.length and not proto.type.varargs
raise ArgumentError, "bad arg count for #{name}: #{a.length} for #{proto.type.args.length}" if a.length != proto.type.args.length and not proto.type.varargs
# convert the arglist suitably for raw_invoke
auto_cb = [] # list of automatic C callbacks generated from lambdas
a = a.zip(proto.type.args.to_a).map { |ra, fa|
a = a.zip(proto.type.args).map { |ra, fa|
aa = convert_rb2c(fa, ra, :cb_list => auto_cb)
if fa and fa.type.integral? and cp.sizeof(fa) == 8 and host_cpu.size == 32
aa = [aa & 0xffff_ffff, (aa >> 32) & 0xffff_ffff]
@@ -1027,10 +965,6 @@ EOS
raise "invalid callback #{'%x' % id} not in #{@@callback_table.keys.map { |c| c.to_s(16) }}" if not cb
rawargs = args.dup
if host_cpu.shortname == 'ia32' and (not cb[:proto_ori] or not cb[:proto_ori].has_attribute('fastcall'))
rawargs.shift
rawargs.shift
end
ra = cb[:proto] ? cb[:proto].args.map { |fa| convert_cbargs_c2rb(fa, rawargs) } : []
# run it
@@ -1061,7 +995,6 @@ EOS
# XXX val is an integer, how to decode Floats etc ? raw binary ptr ?
def self.convert_c2rb(formal, val)
formal = formal.type if formal.kind_of? C::Variable
val &= (1 << 8*cp.sizeof(formal))-1 if formal.integral?
val = Expression.make_signed(val, 8*cp.sizeof(formal)) if formal.integral? and formal.signed?
val = nil if formal.pointer? and val == 0
val
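The lines above narrow a raw machine word to the width of its C type and, for signed types, reinterpret it as a negative number where appropriate. The arithmetic can be sketched without any Metasm dependency as shown below. `to_unsigned` and `to_signed` are hypothetical helper names, and this is not a claim about how `Expression.make_signed` is implemented.

```ruby
# Keeps only the low +bits+ bits of +val+.
def to_unsigned(val, bits)
  val & ((1 << bits) - 1)
end

# Interprets the low +bits+ bits of +val+ as a two's complement integer.
def to_signed(val, bits)
  val = to_unsigned(val, bits)
  val >= (1 << (bits - 1)) ? val - (1 << bits) : val
end
```

Masking before the sign conversion matters when the raw value carries garbage in its upper bits, for example a `char` returned in a 32-bit register.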
@@ -1086,8 +1019,13 @@ EOS
if (v and v.initializer) or cp.toplevel.statements.find { |st| st.kind_of? C::Asm }
cp.toplevel.statements.delete_if { |st| st.kind_of? C::Asm }
cp.toplevel.symbol.delete v.name if v
sc = sc_map_resolve(compile_c(proto))
sc.base_x
sc = compile_c(proto)
ptr = memory_alloc(sc.encoded.length)
sc.base_addr = ptr
# TODO fixup external calls
memory_write ptr, sc.encode_string
memory_perm ptr, sc.encoded.length, 'rwx'
ptr
elsif not v
raise 'empty prototype'
else
@@ -1106,7 +1044,6 @@ EOS
cb[:id] = id
cb[:proc] = b
cb[:proto] = proto
cb[:proto_ori] = ori
cb[:abi_stackfix] = proto.args.inject(0) { |s, a| s + [cp.sizeof(a), cp.typesize[:ptr]].max } if ori and ori.has_attribute('stdcall')
cb[:abi_stackfix] = proto.args[2..-1].to_a.inject(0) { |s, a| s + [cp.sizeof(a), cp.typesize[:ptr]].max } if ori and ori.has_attribute('fastcall') # supersedes stdcall
@@callback_table[id] = cb
@@ -1121,34 +1058,29 @@ EOS
# finds a free callback id, allocates a new page if needed
def self.callback_find_id
if not id = @@callback_addrs.find { |a| not @@callback_table[a] }
page_size = 4096
cb_page = memory_alloc(page_size)
cb_page = memory_alloc(4096)
sc = Shellcode.new(host_cpu, cb_page)
case sc.cpu.shortname
when 'ia32'
asm = "call #{CALLBACK_TARGET}"
addr = cb_page
nrcb = 128 # TODO should be 4096/5, but the parser/compiler is really too slow
nrcb.times {
@@callback_addrs << addr
sc.parse "call #{CALLBACK_TARGET}"
addr += 5
}
when 'x64'
if (cb_page - CALLBACK_TARGET).abs >= 0x7fff_f000
# cannot directly 'jmp CB_T'
asm = "1: mov rax, #{CALLBACK_TARGET} push rax lea rax, [rip-$_+1b] ret"
else
asm = "1: lea rax, [rip-$_+1b] jmp #{CALLBACK_TARGET}"
end
else
raise 'Who are you?'
addr = cb_page
nrcb = 128 # same remark
nrcb.times {
@@callback_addrs << addr
sc.parse "1: lea rax, [rip-$_+1b] jmp #{CALLBACK_TARGET}"
addr += 12 # XXX approximative..
}
end
# fill the page with valid callbacks
loop do
off = sc.encoded.length
sc.assemble asm
break if sc.encoded.length > page_size
@@callback_addrs << (cb_page + off)
end
memory_write cb_page, sc.encode_string[0, page_size]
memory_perm cb_page, page_size, 'rx'
sc.assemble
memory_write cb_page, sc.encode_string
memory_perm cb_page, 4096, 'rx'
raise 'callback_alloc bouh' if not id = @@callback_addrs.find { |a| not @@callback_table[a] }
end
id
@@ -1158,17 +1090,23 @@ EOS
# returns the raw pointer to the code page
# if given a block, run the block and then undefine all the C functions & free memory
def self.new_func_c(src)
sc = sc_map_resolve(compile_c(src))
sc = compile_c(src)
ptr = memory_alloc(sc.encoded.length)
sc.base_addr = ptr
bd = sc.encoded.binding(ptr)
sc.encoded.reloc_externals.uniq.each { |ext| bd[ext] = sym_addr(lib_from_sym(ext), ext) or raise "unknown symbol #{ext}" }
sc.encoded.fixup(bd)
memory_write ptr, sc.encode_string
memory_perm ptr, sc.encoded.length, 'rwx'
parse_c(src) # XXX the Shellcode parser may have defined stuff / interpreted C another way...
defs = []
cp.toplevel.symbol.dup.each_value { |v|
next if not v.kind_of? C::Variable
cp.toplevel.symbol.delete v.name
next if not v.type.kind_of? C::Function or not v.initializer
next if not off = sc.encoded_x.export[v.name]
next if not off = sc.encoded.export[v.name]
rbname = c_func_name_to_rb(v.name)
new_caller_for(v, rbname, sc.base_x+off)
new_caller_for(v, rbname, ptr+off)
defs << rbname
}
if block_given?
@@ -1176,20 +1114,16 @@ EOS
yield
ensure
defs.each { |d| class << self ; self ; end.send(:remove_method, d) }
memory_free sc.base_r if sc.base_r
memory_free sc.base_w if sc.base_w
memory_free sc.base_x if sc.base_x
memory_free ptr
end
else
sc.base_x
ptr
end
end
# compile an asm sequence, callable with the ABI of the C prototype given
# function name comes from the prototype
# the shellcode is mapped in read-only memory unless selfmodifyingcode is true
# note that you can use a .data section for simple writable non-executable memory
def self.new_func_asm(proto, asm, selfmodifyingcode=false)
def self.new_func_asm(proto, asm)
proto += "\n;"
old = cp.toplevel.symbol.keys
parse_c(proto)
@@ -1199,38 +1133,38 @@ EOS
raise "invalid func proto #{proto}" if not f.name or not f.type.kind_of? C::Function or f.initializer
cp.toplevel.symbol.delete f.name
sc = Shellcode_RWX.assemble(host_cpu, asm)
sc = sc_map_resolve(sc)
if selfmodifyingcode
memory_perm sc.base_x, sc.encoded_x.length, 'rwx'
end
sc = Shellcode.assemble(host_cpu, asm)
ptr = memory_alloc(sc.encoded.length)
bd = sc.encoded.binding(ptr)
sc.encoded.reloc_externals.uniq.each { |ext| bd[ext] = sym_addr(lib_from_sym(ext), ext) or raise "unknown symbol #{ext}" }
sc.encoded.fixup(bd)
memory_write ptr, sc.encode_string
memory_perm ptr, sc.encoded.length, 'rwx'
rbname = c_func_name_to_rb(f.name)
new_caller_for(f, rbname, sc.base_x)
new_caller_for(f, rbname, ptr)
if block_given?
begin
yield
ensure
class << self ; self ; end.send(:remove_method, rbname)
memory_free sc.base_r if sc.base_r
memory_free sc.base_w if sc.base_w
memory_free sc.base_x
memory_free ptr
end
else
sc.base_x
ptr
end
end
end
# allocate a C::AllocCStruct to hold a specific struct defined in a previous new_api_c
def self.alloc_c_struct(structname, values={})
cp.alloc_c_struct(structname, values)
end
end
# return a C::AllocCStruct mapped over the string (with optional offset)
# str may be an EncodedData
def self.decode_c_struct(structname, str, off=0)
str = str.data if str.kind_of? EncodedData
cp.decode_c_struct(structname, str, off)
end
end
# allocate a C::AllocCStruct holding an Array of typename variables
# if len is an int, it holds the ary length, or it can be an array of initialisers
@@ -1300,70 +1234,69 @@ EOS
when :windows
new_api_c <<EOS, 'kernel32'
#define PAGE_NOACCESS 0x01
#define PAGE_READONLY 0x02
#define PAGE_READWRITE 0x04
#define PAGE_WRITECOPY 0x08
#define PAGE_EXECUTE 0x10
#define PAGE_EXECUTE_READ 0x20
#define PAGE_EXECUTE_READWRITE 0x40
#define PAGE_EXECUTE_WRITECOPY 0x80
#define PAGE_GUARD 0x100
#define PAGE_NOCACHE 0x200
#define PAGE_WRITECOMBINE 0x400
#define PAGE_NOACCESS 0x01
#define PAGE_READONLY 0x02
#define PAGE_READWRITE 0x04
#define PAGE_WRITECOPY 0x08
#define PAGE_EXECUTE 0x10
#define PAGE_EXECUTE_READ 0x20
#define PAGE_EXECUTE_READWRITE 0x40
#define PAGE_EXECUTE_WRITECOPY 0x80
#define PAGE_GUARD 0x100
#define PAGE_NOCACHE 0x200
#define PAGE_WRITECOMBINE 0x400
#define MEM_COMMIT 0x1000
#define MEM_RESERVE 0x2000
#define MEM_DECOMMIT 0x4000
#define MEM_RELEASE 0x8000
#define MEM_FREE 0x10000
#define MEM_PRIVATE 0x20000
#define MEM_MAPPED 0x40000
#define MEM_RESET 0x80000
#define MEM_TOP_DOWN 0x100000
#define MEM_WRITE_WATCH 0x200000
#define MEM_PHYSICAL 0x400000
#define MEM_LARGE_PAGES 0x20000000
#define MEM_4MB_PAGES 0x80000000
#define MEM_COMMIT 0x1000
#define MEM_RESERVE 0x2000
#define MEM_DECOMMIT 0x4000
#define MEM_RELEASE 0x8000
#define MEM_FREE 0x10000
#define MEM_PRIVATE 0x20000
#define MEM_MAPPED 0x40000
#define MEM_RESET 0x80000
#define MEM_TOP_DOWN 0x100000
#define MEM_WRITE_WATCH 0x200000
#define MEM_PHYSICAL 0x400000
#define MEM_LARGE_PAGES 0x20000000
#define MEM_4MB_PAGES 0x80000000
__stdcall uintptr_t VirtualAlloc(uintptr_t addr, uintptr_t size, int type, int prot);
__stdcall uintptr_t VirtualFree(uintptr_t addr, uintptr_t size, int freetype);
__stdcall uintptr_t VirtualProtect(uintptr_t addr, uintptr_t size, int prot, int *oldprot);
EOS
# allocate some memory suitable for code allocation (ie VirtualAlloc)
def self.memory_alloc(sz)
virtualalloc(nil, sz, MEM_RESERVE|MEM_COMMIT, PAGE_READWRITE)
end
# free memory allocated through memory_alloc
def self.memory_free(addr)
virtualfree(addr, 0, MEM_RELEASE)
end
# change memory permissions - perm in [r rw rx rwx]
def self.memory_perm(addr, len, perm)
perm = { 'r' => PAGE_READONLY, 'rw' => PAGE_READWRITE, 'rx' => PAGE_EXECUTE_READ,
'rwx' => PAGE_EXECUTE_READWRITE }[perm.to_s.downcase]
virtualprotect(addr, len, perm, str_ptr([0].pack('C')*8))
end
when :linux
new_api_c <<EOS
#define PROT_READ 0x1
#define PROT_READ 0x1
#define PROT_WRITE 0x2
#define PROT_EXEC 0x4
#define PROT_EXEC 0x4
#define MAP_PRIVATE 0x2
#define MAP_FIXED 0x10
#define MAP_PRIVATE 0x2
#define MAP_ANONYMOUS 0x20
uintptr_t mmap(uintptr_t addr, uintptr_t length, int prot, int flags, uintptr_t fd, uintptr_t offset);
uintptr_t munmap(uintptr_t addr, uintptr_t length);
uintptr_t mprotect(uintptr_t addr, uintptr_t len, int prot);
EOS
# allocate some memory suitable for code allocation (ie mmap)
def self.memory_alloc(sz)
@mmaps ||= {} # save size for mem_free
@@ -1371,48 +1304,26 @@ EOS
@mmaps[a] = sz
a
end
# free memory allocated through memory_alloc
def self.memory_free(addr)
munmap(addr, @mmaps[addr])
end
# change memory permissions - perm 'rwx'
# on PaX-enabled systems, this may need a non-mprotect-restricted ruby interpreter
# if a mapping 'rx' is denied, will try to create a file and mmap() it rx in place
def self.memory_perm(addr, len, perm)
perm = perm.to_s.downcase
len += (addr & 0xfff) + 0xfff
len &= ~0xfff
addr &= ~0xfff
p = 0
p |= PROT_READ if perm.include?('r')
p |= PROT_WRITE if perm.include?('w')
p |= PROT_EXEC if perm.include?('x')
ret = mprotect(addr, len, p)
if ret != 0 and perm.include?('x') and not perm.include?('w') and len > 0 and @memory_perm_wd ||= find_write_dir
# We are on a PaX-mprotected system. Try to use a file mapping to work around it.
Dir.chdir(@memory_perm_wd) {
fname = 'tmp_mprot_%d_%x' % [Process.pid, addr]
data = memory_read(addr, len)
begin
File.open(fname, 'w') { |fd| fd.write data }
# reopen to ensure filesystem flush
rret = File.open(fname, 'r') { |fd| mmap(addr, len, p, MAP_FIXED|MAP_PRIVATE, fd.fileno, 0) }
raise 'hax' if data != memory_read(addr, len)
ret = 0 if rret == addr
ensure
File.unlink(fname) rescue nil
end
}
end
ret
p |= PROT_READ if perm.include? 'r'
p |= PROT_WRITE if perm.include? 'w'
p |= PROT_EXEC if perm.include? 'x'
mprotect(addr, len, p)
end
end
end
end
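The Linux `memory_perm` in the hunk above widens the requested range to whole pages before calling `mprotect`, and builds the protection bits from the letters of the `perm` string. The following is a minimal standalone sketch of that arithmetic, assuming a 4096-byte page size (as implied by the `0xfff` masks) and the `PROT_*` values defined in the hunk. The names `page_align` and `perm_to_prot` are hypothetical helpers for illustration and are not part of Metasm.

```ruby
# Hypothetical helpers (not part of Metasm), mirroring the arithmetic in memory_perm.
# Assumes 4096-byte pages, like the 0xfff masks used in the diff.
PAGE_MASK  = 0xfff
PROT_READ  = 0x1
PROT_WRITE = 0x2
PROT_EXEC  = 0x4

# Widen [addr, addr+len) so that it starts and ends on page boundaries.
def page_align(addr, len)
  len += (addr & PAGE_MASK) + PAGE_MASK # add the offset inside the first page, then round up
  len &= ~PAGE_MASK                     # keep a whole number of pages
  addr &= ~PAGE_MASK                    # move the start down to its page boundary
  [addr, len]
end

# Translate a permission string such as 'rx' or 'rwx' into mprotect-style bits.
def perm_to_prot(perm)
  perm = perm.to_s.downcase
  prot = 0
  prot |= PROT_READ  if perm.include?('r')
  prot |= PROT_WRITE if perm.include?('w')
  prot |= PROT_EXEC  if perm.include?('x')
  prot
end
```

A range that straddles a page boundary, such as 16 bytes starting at `0x1ff8`, is widened to two full pages starting at `0x1000`, which is what `mprotect` requires of its address argument.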
+1 -10
@@ -271,16 +271,7 @@ class Expression
def encode(type, endianness, backtrace=nil)
case val = reduce
when Integer; EncodedData.new Expression.encode_imm(val, type, endianness, backtrace)
else
str = case INT_SIZE[type]
when 8; "\0"
when 16; "\0\0"
when 32; "\0\0\0\0"
when 64; "\0\0\0\0\0\0\0\0"
else [0].pack('C')*(INT_SIZE[type]/8)
end
str = str.force_encoding('BINARY') if str.respond_to?(:force_encoding)
EncodedData.new(str, :reloc => {0 => Relocation.new(self, type, endianness, backtrace)})
else EncodedData.new([0].pack('C')*(INT_SIZE[type]/8), :reloc => {0 => Relocation.new(self, type, endianness, backtrace)})
end
end
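Both variants of `Expression#encode` in the hunk above emit a zero-filled placeholder of the target width when the expression does not reduce to an Integer, and attach a `Relocation` at offset 0 so the real value can be patched in later. The sketch below shows only the placeholder part, in isolation. `INT_SIZE_SKETCH` is an illustrative subset standing in for Metasm's `INT_SIZE` table, and `zero_placeholder` is a hypothetical name.

```ruby
# Illustrative subset of a type => bit-width table; Metasm's own INT_SIZE
# table is assumed to be larger. Both names here are hypothetical.
INT_SIZE_SKETCH = { :u8 => 8, :u16 => 16, :u32 => 32, :u64 => 64 }

# Build a binary string of zero bytes wide enough to hold a value of +type+.
def zero_placeholder(type)
  str = [0].pack('C') * (INT_SIZE_SKETCH.fetch(type) / 8)
  # Ruby 1.8 strings have no encoding, hence the respond_to? guard
  str = str.force_encoding('BINARY') if str.respond_to?(:force_encoding)
  str
end
```

The `force_encoding` guard matches the longer variant in the diff; it keeps the placeholder a byte string on Ruby versions with per-string encodings, so later binary concatenation does not raise encoding errors.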
+6 -6
@@ -58,7 +58,7 @@ class AOut < ExeFormat
class Relocation < SerialStruct
word :address
bitfield :word, 0 => :symbolnum, 24 => :pcrel, 25 => :length,
27 => :extern, 28 => :baserel, 29 => :jmptable, 30 => :relative, 31 => :rtcopy
27 => :extern, 28 => :baserel, 29 => :jmptable, 30 => :relative, 31 => :rtcopy
fld_enum :length, 0 => 1, 1 => 2, 2 => 4, 3 => 8
fld_default :length, 4
end
@@ -68,7 +68,7 @@ class AOut < ExeFormat
bitfield :byte, 0 => :extern, 1 => :type, 5 => :stab
byte :other
half :desc
word :value
word :value
attr_accessor :name
def decode(aout, strings=nil)
@@ -119,11 +119,11 @@ class AOut < ExeFormat
@data = EncodedData.new << @encoded.read(@header.data)
textrel = @encoded.read @header.trsz
datarel = @encoded.read @header.drsz
syms = @encoded.read @header.syms
strings = @encoded.read
# TODO
#textrel = @encoded.read @header.trsz
#datarel = @encoded.read @header.drsz
#syms = @encoded.read @header.syms
#strings = @encoded.read
end
def encode
-1
@@ -58,7 +58,6 @@ register_signature("\xca\xfe\xba\xbe") { UniversalBinary }
register_signature("dex\n") { DEX }
register_signature("dey\n") { DEY }
register_signature("\xfa\x70\x0e\x1f") { FatELF }
register_signature("\x50\x4b\x03\x04") { ZIP }
register_signature('Metasm.dasm') { Disassembler }
# replacement for AutoExe where #load defaults to a Shellcode of the specified CPU
+31 -60
@@ -9,7 +9,6 @@ require 'metasm/decode'
module Metasm
# BFLT is the binary flat format used by uClinux
# from examining a v4 binary, it looks like the header is discarded and the file is mapped from 0x40 to memory address 0 (wrt relocations)
class Bflt < ExeFormat
MAGIC = 'bFLT'
FLAGS = { 1 => 'RAM', 2 => 'GOTPIC', 4 => 'GZIP' }
@@ -30,20 +29,13 @@ class Bflt < ExeFormat
when MAGIC
else raise InvalidExeFormat, "Bad bFLT signature #@magic"
end
if @rev >= 0x01000000 and (@rev & 0x00f0ffff) == 0
puts "Bflt: probable wrong endianness, retrying" if $VERBOSE
exe.endianness = { :big => :little, :little => :big }[exe.endianness]
exe.encoded.ptr -= 4*16
super(exe)
end
end
def set_default_values(exe)
@magic ||= MAGIC
@rev ||= 4
@entry ||= 0x40
@data_start ||= 0x40 + exe.text.length if exe.text
@data_start ||= @entry + exe.text.length if exe.text
@data_end ||= @data_start + exe.data.data.length if exe.data
@bss_end ||= @data_start + exe.data.length if exe.data
@stack_size ||= 0x1000
@@ -58,7 +50,6 @@ class Bflt < ExeFormat
def decode_word(edata = @encoded) edata.decode_imm(:u32, @endianness) end
def encode_word(w) Expression[w].encode(:u32, @endianness) end
attr_accessor :endianness
def initialize(cpu = nil)
@endianness = cpu ? cpu.endianness : :little
@header = Header.new
@@ -70,17 +61,17 @@ class Bflt < ExeFormat
def decode_header
@encoded.ptr = 0
@header.decode(self)
@encoded.add_export(new_label('entrypoint'), @header.entry)
end
def decode
decode_header
@text = @encoded[0x40...@header.data_start]
@data = @encoded[@header.data_start...@header.data_end]
@data.virtsize += @header.bss_end - @header.data_end
@encoded.ptr = @header.entry
@text = EncodedData.new << @encoded.read(@header.data_start - @header.entry)
@data = EncodedData.new << @encoded.read(@header.data_end - @header.data_start)
@data.virtsize += (@header.bss_end - @header.data_end)
if @header.flags.include?('GZIP')
if @header.flags.include? 'GZIP'
# TODO gzip
raise 'bFLT decoder: gzip format not supported'
end
@@ -88,7 +79,7 @@ class Bflt < ExeFormat
@reloc = []
@encoded.ptr = @header.reloc_start
@header.reloc_count.times { @reloc << decode_word }
if @header.rev == 2
if @header.version == 2
@reloc.map! { |r| r & 0x3fff_ffff }
end
@@ -96,29 +87,32 @@ class Bflt < ExeFormat
end
def decode_interpret_relocs
textsz = @header.data_start-0x40
@reloc.each { |r|
# where the reloc is
if r < textsz
if r >= @header.entry and r < @header.data_start
section = @text
off = section.ptr = r
else
base = @header.entry
elsif r >= @header.data_start and r < @header.data_end
section = @data
off = section.ptr = r-textsz
end
# what it points to
target = decode_word(section)
if target < textsz
target = label_at(@text, target, "xref_#{Expression[target]}")
elsif target < @header.bss_end-0x40
target = label_at(@data, target-textsz, "xref_#{Expression[target]}")
base = @header.data_start
else
puts "out of bounds reloc target #{Expression[target]} at #{Expression[r]}" if $VERBOSE
puts "out of bounds reloc at #{Expression[r]}" if $VERBOSE
next
end
section.reloc[off] = Relocation.new(Expression[target], :u32, @endianness)
# what it points to
section.ptr = r-base
target = decode_word(section)
if target >= @header.entry and target < @header.data_start
target = label_at(@text, target - @header.entry, "xref_#{Expression[target]}")
elsif target >= @header.data_start and target < @header.bss_end
target = label_at(@data, target - @header.data_start, "xref_#{Expression[target]}")
else
puts "out of bounds reloc target at #{Expression[r]}" if $VERBOSE
next
end
@text.reloc[r-base] = Relocation.new(Expression[target], :u32, @endianness)
}
end
@@ -133,8 +127,8 @@ class Bflt < ExeFormat
@encoded = EncodedData.new
@encoded << @header.encode(self)
binding = @text.binding(0x40).merge(@data.binding(@header.data_start))
binding = @text.binding(@header.entry).merge(@data.binding(@header.data_start))
@encoded << @text << @data.data
@encoded.fixup! binding
@encoded.reloc.clear
@@ -149,7 +143,7 @@ class Bflt < ExeFormat
mapaddr = new_label('mapaddr')
binding = @text.binding(mapaddr).merge(@data.binding(mapaddr))
[@text, @data].each { |section|
base = 0x40 # XXX maybe 0 ?
base = @header.entry || 0x40
base = @header.data_start || base+@text.length if section == @data
section.reloc.each { |o, r|
if r.endianness == @endianness and [:u32, :a32, :i32].include? r.type and
@@ -173,16 +167,7 @@ class Bflt < ExeFormat
case instr.raw.downcase
when '.text'; @cursource = @textsrc
when '.data'; @cursource = @datasrc
when '.entrypoint'
# ".entrypoint <somelabel/expression>" or ".entrypoint" (here)
@lexer.skip_space
if tok = @lexer.nexttok and tok.type == :string
raise instr if not entrypoint = Expression.parse(@lexer)
else
entrypoint = new_label('entrypoint')
@cursource << Label.new(entrypoint, instr.backtrace.dup)
end
@header.entry = entrypoint
# entrypoint is the 1st byte of .text
else super(instr)
end
end
@@ -196,23 +181,9 @@ class Bflt < ExeFormat
self
end
def get_default_entrypoints
['entrypoint']
end
def each_section
yield @text, 0
yield @data, @header.data_start - @header.entry
end
def section_info
[['.text', 0, @text.length, 'rx'],
['.data', @header.data_addr-0x40, @data.data.length, 'rw'],
['.bss', @header.data_end-0x40, @data.length-@data.data.length, 'rw']]
end
def module_symbols
['entrypoint', @header.entry-0x40]
yield @text, @header.entry
yield @data, @header.data_start
end
end
end
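The two versions of `decode_interpret_relocs` in the diff above differ in how they map a relocation offset to a section. One treats offsets as relative to the start of `.text` (file offset `0x40`), the other compares them against `@header.entry` and `@header.data_start`. The sketch below follows the second variant only, as a standalone function. `classify_bflt_offset` and its arguments are hypothetical names chosen for illustration, not Metasm API.

```ruby
# Hypothetical helper (not part of Metasm): decide which bFLT section an
# offset falls into, using the header-relative range checks from the diff.
# Returns [section, offset_within_section], or nil when out of bounds.
def classify_bflt_offset(off, entry, data_start, data_end)
  if off >= entry and off < data_start
    [:text, off - entry]        # inside .text, rebased to the section start
  elsif off >= data_start and off < data_end
    [:data, off - data_start]   # inside .data, rebased to the section start
  else
    nil                         # the decoder logs "out of bounds reloc" and skips it
  end
end
```

With `entry = 0x40`, `data_start = 0x100` and `data_end = 0x180`, the offset `0x50` lands 0x10 bytes into `.text` and `0x120` lands 0x20 bytes into `.data`; anything at or past `0x180` is rejected.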
+2 -2
@@ -81,7 +81,7 @@ class COFF < ExeFormat
11 => 'UNION_MEMBER', 12 => 'UNION_TAG', 13 => 'TYPEDEF', 14 => 'UNDEF_STATIC',
15 => 'ENUM_TAG', 16 => 'ENUM_MEMBER', 17 => 'REG_PARAM', 18 => 'BIT_FIELD',
100 => 'BLOCK', 101 => 'FUNCTION', 102 => 'END_STRUCT',
103 => 'FILE', 104 => 'SECTION', 105 => 'WEAK_EXT',
103 => 'FILE', 104 => 'SECTION', 105 => 'WEAK_EXT',
}
DEBUG_TYPE = { 0 => 'UNKNOWN', 1 => 'COFF', 2 => 'CODEVIEW', 3 => 'FPO', 4 => 'MISC',
@@ -264,7 +264,7 @@ class COFF < ExeFormat
class TLSDirectory < SerialStruct
xwords :start_va, :end_va, :index_addr, :callback_p
words :zerofill_sz, :characteristics
words :zerofill_sz, :characteristics
attr_accessor :callbacks
end
+20 -52
@@ -17,15 +17,13 @@ class COFF
# decodes a COFF optional header from coff.cursection
# also decodes directories in coff.directory
def decode(coff)
return set_default_values(coff) if coff.header.size_opthdr == 0 and not coff.header.characteristics.include?('EXECUTABLE_IMAGE')
off = coff.curencoded.ptr
return set_default_values(coff) if coff.header.size_opthdr == 0
super(coff)
nrva = (coff.header.size_opthdr - (coff.curencoded.ptr - off)) / 8
nrva = @numrva if nrva < 0
if nrva > DIRECTORIES.length or nrva != @numrva
puts "W: COFF: Weird directories count #{@numrva}" if $VERBOSE
nrva = DIRECTORIES.length if nrva > DIRECTORIES.length
nrva = @numrva
if @numrva > DIRECTORIES.length
puts "W: COFF: Invalid directories count #{@numrva}" if $VERBOSE
nrva = DIRECTORIES.length
end
coff.directory = {}
@@ -173,17 +171,17 @@ class COFF
end
class ResourceDirectory
def decode(coff, edata = coff.curencoded, startptr = edata.ptr, maxdepth=3)
def decode(coff, edata = coff.curencoded, startptr = edata.ptr)
super(coff, edata)
@entries = []
nrnames = @nr_names if $DEBUG
(@nr_names+@nr_id).times {
e = Entry.new
e = Entry.new
e_id = coff.decode_word(edata)
e_ptr = coff.decode_word(edata)
e_id = coff.decode_word(edata)
e_ptr = coff.decode_word(edata)
if not e_id.kind_of? Integer or not e_ptr.kind_of? Integer
puts 'W: COFF: relocs in the rsrc directory?' if $VERBOSE
@@ -215,12 +213,10 @@ class COFF
e.subdir_p = e_ptr & 0x7fff_ffff
if startptr + e.subdir_p >= edata.length
puts 'W: COFF: invalid resource structure: directory too far' if $VERBOSE
elsif maxdepth > 0
else
edata.ptr = startptr + e.subdir_p
e.subdir = ResourceDirectory.new
e.subdir.decode coff, edata, startptr, maxdepth-1
else
puts 'W: COFF: recursive resource section' if $VERBOSE
e.subdir.decode coff, edata, startptr
end
else
e.dataentry_p = e_ptr
@@ -248,8 +244,7 @@ class COFF
decode_tllv = lambda { |ed, state|
sptr = ed.ptr
len, vlen = coff.decode_half(ed), coff.decode_half(ed)
coff.decode_half(ed) # type
len, vlen, type = coff.decode_half(ed), coff.decode_half(ed), coff.decode_half(ed)
tagname = ''
while c = coff.decode_half(ed) and c != 0
tagname << (c&255)
@@ -278,7 +273,7 @@ class COFF
when :str
val = ed.read(vlen*2).unpack('v*')
val.pop if val[-1] == 0
val = val.pack('C*') if val.all? { |c_| c_ > 0 and c_ < 256 }
val = val.pack('C*') if val.all? { |c_| c_ > 0 and c_ < 256 }
vers[tagname] = val
when :var
val = ed.read(vlen).unpack('V*')
@@ -431,7 +426,8 @@ class COFF
def sect_at_rva(rva)
return if not rva or rva <= 0
if sections and not @sections.empty?
if s = @sections.find { |s_| s_.virtaddr <= rva and s_.virtaddr + EncodedData.align_size((s_.virtsize == 0 ? s_.rawsize : s_.virtsize), @optheader.sect_align) > rva }
valign = lambda { |l| EncodedData.align_size(l, @optheader.sect_align) }
if s = @sections.find { |s_| s_.virtaddr <= rva and s_.virtaddr + valign[s_.virtsize] > rva }
s.encoded.ptr = rva - s.virtaddr
@cursection = s
elsif rva < @sections.map { |s_| s_.virtaddr }.min
@@ -483,7 +479,7 @@ class COFF
end
def each_section
if @header.size_opthdr == 0 and not @header.characteristics.include?('EXECUTABLE_IMAGE')
if @header.size_opthdr == 0
@sections.each { |s|
next if not s.encoded
l = new_label(s.name)
@@ -494,9 +490,7 @@ class COFF
end
base = @optheader.image_base
base = 0 if not base.kind_of? Integer
sz = @optheader.headers_size
sz = EncodedData.align_size(@optheader.image_size, 4096) if @sections.empty?
yield @encoded[0, sz], base
yield @encoded[0, @optheader.headers_size], base
@sections.each { |s| yield s.encoded, base + s.virtaddr }
end
@@ -572,10 +566,8 @@ class COFF
# decodes a section content (allows simpler LoadedPE override)
def decode_section_body(s)
raw = EncodedData.align_size(s.rawsize, @optheader.file_align)
virt = s.virtsize
virt = EncodedData.align_size(s.virtsize, @optheader.sect_align)
virt = raw = s.rawsize if @header.size_opthdr == 0
virt = raw if virt == 0
virt = EncodedData.align_size(virt, @optheader.sect_align)
s.encoded = @encoded[s.rawaddr, [raw, virt].min] || EncodedData.new
s.encoded.virtsize = virt
end
@@ -642,13 +634,8 @@ class COFF
if ct = @directory['certificate_table']
@certificates = []
@cursection = self
if ct[0] > @encoded.length or ct[1] > @encoded.length - ct[0]
puts "W: COFF: invalid certificate_table #{'0x%X+0x%0X' % ct}" if $VERBOSE
ct = [ct[0], 1]
end
@encoded.ptr = ct[0]
off_end = ct[0]+ct[1]
off_end = @encoded.length if off_end > @encoded.length
while @encoded.ptr < off_end
certlen = decode_word
certrev = decode_half
@@ -717,25 +704,6 @@ class COFF
end
end
def decode_reloc_amd64(r)
case r.type
when 'ABSOLUTE'
when 'HIGHLOW'
addr = decode_word
if s = sect_at_va(addr)
label = label_at(s.encoded, s.encoded.ptr, "xref_#{Expression[addr]}")
Metasm::Relocation.new(Expression[label], :u32, @endianness)
end
when 'DIR64'
addr = decode_xword
if s = sect_at_va(addr)
label = label_at(s.encoded, s.encoded.ptr, "xref_#{Expression[addr]}")
Metasm::Relocation.new(Expression[label], :u64, @endianness)
end
else puts "W: COFF: Unsupported amd64 relocation #{r.inspect}" if $VERBOSE
end
end
def decode_debug
if dd = @directory['debug'] and sect_at_rva(dd[0])
@debug = []
@@ -751,11 +719,11 @@ class COFF
def decode_tls
if @directory['tls_table'] and sect_at_rva(@directory['tls_table'][0])
@tls = TLSDirectory.decode(self)
if s = sect_at_va(@tls.callback_p)
if s = sect_at_va(@tls.callback_p)
s.encoded.add_export 'tls_callback_table'
@tls.callbacks.each_with_index { |cb, i|
@tls.callbacks[i] = curencoded.add_export "tls_callback_#{i}" if sect_at_rva(cb)
}
}
end
end
end
+14 -12
@@ -139,7 +139,7 @@ class COFF
end
class ImportDirectory
# encode all import directories + iat
# encodes all import directories + iat
def self.encode(coff, ary)
edata = { 'iat' => [] }
%w[idata ilt nametable].each { |name| edata[name] = EncodedData.new }
@@ -160,11 +160,12 @@ class COFF
[it, iat]
end
# encode one import directory + iat + names in the edata hash received as arg
# encodes an import directory + iat + names in the edata hash received as arg
def encode(coff, edata)
edata['iat'] << EncodedData.new
# edata['ilt'] = edata['iat']
label = lambda { |n| coff.label_at(edata[n], 0, n) }
rva = lambda { |n| Expression[label[n], :-, coff.label_at(coff.encoded, 0)] }
rva_end = lambda { |n| Expression[[label[n], :-, coff.label_at(coff.encoded, 0)], :+, edata[n].virtsize] }
@libname_p = rva_end['nametable']
@@ -395,8 +396,7 @@ class COFF
s.characteristics = %w[MEM_READ MEM_WRITE MEM_DISCARDABLE]
encode_append_section s
if @imports.first and @imports.first.iat_p.kind_of?(Integer)
# ordiat = iat.sort_by { @import[x].iat_p }
if @imports.first and @imports.first.iat_p.kind_of? Integer
ordiat = @imports.zip(iat).sort_by { |id, it| id.iat_p.kind_of?(Integer) ? id.iat_p : 1<<65 }.map { |id, it| it }
else
ordiat = iat
@@ -413,7 +413,7 @@ class COFF
plt.characteristics = %w[MEM_READ MEM_EXECUTE]
@imports.zip(iat) { |id, it|
if id.iat_p.kind_of?(Integer) and @sections.find { |s_| s_.virtaddr <= id.iat_p and s_.virtaddr + (s_.virtsize || s_.encoded.virtsize) > id.iat_p }
if id.iat_p.kind_of? Integer and s = @sections.find { |s_| s_.virtaddr <= id.iat_p and s_.virtaddr + (s_.virtsize || s_.encoded.virtsize) > id.iat_p }
id.iat = it # will be fixed up after encode_section
else
# XXX should not be mixed (for @directory['iat'][1])
@@ -529,7 +529,9 @@ class COFF
end
# initialize reloc table base address if needed
rt.base_addr ||= off & ~0xfff
if not rt.base_addr
rt.base_addr = off & ~0xfff
end
(rt.relocs ||= []) << r
elsif $DEBUG and not rel.target.bind(binding).reduce.kind_of?(Integer)
@@ -557,7 +559,7 @@ class COFF
end
# initialize the header from target/cpu/etc, target in ['exe' 'dll' 'kmod' 'obj']
def pre_encode_header(target='exe', want_relocs=true)
def pre_encode_header(target = 'exe', want_relocs=true)
target = {:bin => 'exe', :lib => 'dll', :obj => 'obj', 'sys' => 'kmod', 'drv' => 'kmod'}.fetch(target, target)
@header.machine ||= case @cpu.shortname
@@ -648,11 +650,11 @@ class COFF
# append the section bodies to @encoded, and link the resulting binary
def encode_sections_fixup
@encoded.align @optheader.file_align
if @optheader.headers_size.kind_of?(::String)
@encoded.fixup! @optheader.headers_size => @encoded.virtsize
@optheader.headers_size = @encoded.virtsize
end
@encoded.align @optheader.file_align
baseaddr = @optheader.image_base.kind_of?(::Integer) ? @optheader.image_base : 0x400000
binding = @encoded.binding(baseaddr)
@@ -687,7 +689,7 @@ class COFF
# patch the iat where iat_p was defined
# sort to ensure a 0-terminated will not overwrite an entry
# (try to dump notepad.exe, which has a forwarder;)
@imports.find_all { |id| id.iat_p.kind_of?(Integer) }.sort_by { |id| id.iat_p }.each { |id|
@imports.find_all { |id| id.iat_p.kind_of? Integer }.sort_by { |id| id.iat_p }.each { |id|
s = sect_at_rva(id.iat_p)
@encoded[s.rawaddr + s.encoded.ptr, id.iat.virtsize] = id.iat
binding.update id.iat.binding(baseaddr + id.iat_p)
@@ -708,7 +710,7 @@ class COFF
# creates the base relocation tables (needed for references to IAT not known before)
# defaults to generating relocatable files, e.g. ASLR-aware
# pass want_relocs=false to avoid the file overhead induced by this
def encode(target='exe', want_relocs=true)
def encode(target = 'exe', want_relocs = true)
@encoded = EncodedData.new
label_at(@encoded, 0, 'coff_start')
pre_encode_header(target, want_relocs)
@@ -830,7 +832,7 @@ class COFF
@lexer.unreadtok tok if not tok = @lexer.readtok or tok.type != :punct or tok.raw != '='
raise instr, 'invalid base' if not s.virtaddr = Expression.parse(@lexer).reduce or not s.virtaddr.kind_of?(::Integer)
if not @optheader.image_base
@optheader.image_base = (s.virtaddr-0x80) & 0xfff00000
@optheader.image_base = (s.virtaddr-0x80) & 0xfff00000
puts "Warning: no image_base specified, using #{Expression[@optheader.image_base]}" if $VERBOSE
end
s.virtaddr -= @optheader.image_base
@@ -1046,7 +1048,7 @@ class COFF
end
if not dll = autoexports[sym]
sym += fallback_append if sym.kind_of?(::String) and fallback_append.kind_of?(::String)
next if not dll = autoexports[sym]
next if not dll = autoexports[sym]
end
@imports ||= []
+5 -10
@@ -135,7 +135,7 @@ class DEX < ExeFormat
class MethodId < SerialStruct
u2 :classidx
u2 :protoidx
u2 :typeidx
u4 :nameidx
end
@@ -182,7 +182,7 @@ class DEX < ExeFormat
uleb :fieldid_diff # this field id - array.previous field id
uleb :access
attr_accessor :fieldid, :field
attr_accessor :field
end
class EncodedMethod < SerialStruct
@@ -190,7 +190,7 @@ class DEX < ExeFormat
uleb :access
uleb :codeoff # offset to CodeItem
attr_accessor :methodid, :method, :code, :name
attr_accessor :method, :code, :name
end
class TypeItem < SerialStruct
@@ -256,7 +256,7 @@ class DEX < ExeFormat
uleb :typeidx
uleb :handleroff
end
class Link < SerialStruct
# undefined
end
@@ -390,7 +390,6 @@ class DEX < ExeFormat
(c.data.direct_methods + [0] + c.data.virtual_methods).each { |m|
next id=0 if m == 0
id += m.methodid_diff
m.methodid = id
m.method = @methods[id]
m.name = @strings[m.method.nameidx]
@encoded.ptr = m.codeoff
@@ -442,11 +441,7 @@ class DEX < ExeFormat
end
def get_default_entrypoints
@classes.find_all { |c| c.data }.map { |c|
(c.data.direct_methods + c.data.virtual_methods).map { |m|
m.codeoff+m.code.insns_off
}
}.flatten
[]
end
end
+4 -36
@@ -52,9 +52,8 @@ class ELF < ExeFormat
0x8000_0000 => 'LEDATA'},
'SPARCV9' => {0 => 'TSO', 1 => 'PSO', 2 => 'RMO'}, # XXX not a flag
'MIPS' => {1 => 'NOREORDER', 2 => 'PIC', 4 => 'CPIC',
8 => 'XGOT', 0x10 => '64BIT_WHIRL', 0x20 => 'ABI2',
0x40 => 'ABI_ON32', 0x80 => 'OPTIONSFIRST',
0x100 => '32BITMODE'}
8 => 'XGOT', 16 => '64BIT_WHIRL', 32 => 'ABI2',
64 => 'ABI_ON32'}
}
DYNAMIC_TAG = { 0 => 'NULL', 1 => 'NEEDED', 2 => 'PLTRELSZ', 3 =>
@@ -301,37 +300,6 @@ class ELF < ExeFormat
112 => 'EMB_RELST_LO', 113 => 'EMB_RELST_HI',
114 => 'EMB_RELST_HA', 115 => 'EMB_BIT_FLD',
116 => 'EMB_RELSDA' },
'SH' => { 0 => 'NONE', 1 => 'DIR32', 2 => 'REL32', 3 => 'DIR8WPN',
4 => 'IND12W', 5 => 'DIR8WPL', 6 => 'DIR8WPZ', 7 => 'DIR8BP',
8 => 'DIR8W', 9 => 'DIR8L', 10 => 'LOOP_START', 11 => 'LOOP_END',
22 => 'GNU_VTINHERIT', 23 => 'GNU_VTENTRY', 24 => 'SWITCH8',
25 => 'SWITCH16', 26 => 'SWITCH32', 27 => 'USES', 28 => 'COUNT',
29 => 'ALIGN', 30 => 'CODE', 31 => 'DATA', 32 => 'LABEL',
33 => 'DIR16', 34 => 'DIR8', 35 => 'DIR8UL', 36 => 'DIR8UW',
37 => 'DIR8U', 38 => 'DIR8SW', 39 => 'DIR8S', 40 => 'DIR4UL',
41 => 'DIR4UW', 42 => 'DIR4U', 43 => 'PSHA', 44 => 'PSHL',
45 => 'DIR5U', 46 => 'DIR6U', 47 => 'DIR6S', 48 => 'DIR10S',
49 => 'DIR10SW', 50 => 'DIR10SL', 51 => 'DIR10SQ', 53 => 'DIR16S',
144 => 'TLS_GD_32', 145 => 'TLS_LD_32', 146 => 'TLS_LDO_32',
147 => 'TLS_IE_32', 148 => 'TLS_LE_32', 149 => 'TLS_DTPMOD32',
150 => 'TLS_DTPOFF32', 151 => 'TLS_TPOFF32', 160 => 'GOT32',
161 => 'PLT32', 162 => 'COPY', 163 => 'GLOB_DAT',
164 => 'JMP_SLOT', 165 => 'RELATIVE', 166 => 'GOTOFF',
167 => 'GOTPC', 168 => 'GOTPLT32', 169 => 'GOT_LOW16',
170 => 'GOT_MEDLOW16', 171 => 'GOT_MEDHI16', 172 => 'GOT_HI16',
173 => 'GOTPLT_LOW16', 174 => 'GOTPLT_MEDLOW16', 175 => 'GOTPLT_MEDHI16',
176 => 'GOTPLT_HI16', 177 => 'PLT_LOW16', 178 => 'PLT_MEDLOW16',
179 => 'PLT_MEDHI16', 180 => 'PLT_HI16', 181 => 'GOTOFF_LOW16',
182 => 'GOTOFF_MEDLOW16', 183 => 'GOTOFF_MEDHI16', 184 => 'GOTOFF_HI16',
185 => 'GOTPC_LOW16', 186 => 'GOTPC_MEDLOW16', 187 => 'GOTPC_MEDHI16',
188 => 'GOTPC_HI16', 189 => 'GOT10BY4', 190 => 'GOTPLT10BY4',
191 => 'GOT10BY8', 192 => 'GOTPLT10BY8', 193 => 'COPY64',
194 => 'GLOB_DAT64', 195 => 'JMP_SLOT64', 196 => 'RELATIVE64',
242 => 'SHMEDIA_CODE', 243 => 'PT_16', 244 => 'IMMS16',
245 => 'IMMU16', 246 => 'IMM_LOW16', 247 => 'IMM_LOW16_PCREL',
248 => 'IMM_MEDLOW16', 249 => 'IMM_MEDLOW16_PCREL', 250 => 'IMM_MEDHI16',
251 => 'IMM_MEDHI16_PCREL', 252 => 'IMM_HI16', 253 => 'IMM_HI16_PCREL',
254 => '64', 255 => '64_PCREL' },
'SPARC' => { 0 => 'NONE', 1 => '8', 2 => '16', 3 => '32',
4 => 'DISP8', 5 => 'DISP16', 6 => 'DISP32',
7 => 'WDISP30', 8 => 'WDISP22', 9 => 'HI22',
@@ -748,7 +716,7 @@ class FatELF < ExeFormat
f.encoded = e.encode_string
h = e.header
f.machine, f.abi, f.abi_version, f.e_class, f.data =
h.machine, h.abi, h.abi_version, h.e_class, h.data
h.machine, h.abi, h.abi_version, h.e_class, h.data
end
f.offset = new_label('fat_off')
f.size = f.encoded.size
@@ -844,7 +812,7 @@ typedef struct { /* Verneed Auxiliary Structure. */
Elf32_Word vna_next; /* no. of bytes from start of this */
} Elf32_Vernaux; /* vernaux to next vernaux entry */
typedef Elf32_Half Elf32_Versym; /* Version symbol index array */
typedef Elf32_Half Elf32_Versym; /* Version symbol index array */
typedef struct {
Elf32_Half si_boundto; /* direct bindings - symbol bound to */
+27 -133
@@ -18,19 +18,19 @@ class ELF
case hdr.e_class
when '32'; elf.bitsize = 32
when '64', '64_icc'; elf.bitsize = 64
else puts "W: ELF: unsupported class #{hdr.e_class}, assuming 32bit"; elf.bitsize = 32
else raise InvalidExeFormat, "E: ELF: unsupported class #{hdr.e_class}"
end
case hdr.data
when 'LSB'; elf.endianness = :little
when 'MSB'; elf.endianness = :big
else puts "W: ELF: unsupported endianness #{hdr.data}, assuming littleendian"; elf.endianness = :little
else raise InvalidExeFormat, "E: ELF: unsupported endianness #{hdr.data}"
end
if hdr.i_version != 'CURRENT'
puts ":: ELF: unsupported ELF version #{hdr.i_version}"
raise InvalidExeFormat, "E: ELF: unsupported ELF version #{hdr.i_version}"
end
}
}
end
class Symbol
@@ -66,7 +66,7 @@ class ELF
# handles relocated LoadedELF
def addr_to_fileoff(addr)
la = module_address
la = (la == 0 ? (@load_address ||= 0) : 0)
la = (la == 0 ? (@load_address ||= 0) : 0)
addr_to_off(addr - la)
end
@@ -75,7 +75,7 @@ class ELF
def fileoff_to_addr(foff)
if s = @segments.find { |s_| s_.type == 'LOAD' and s_.offset <= foff and s_.offset + s_.filesz > foff }
la = module_address
la = (la == 0 ? (@load_address ||= 0) : 0)
la = (la == 0 ? (@load_address ||= 0) : 0)
s.vaddr + la + foff - s.offset
end
end
@@ -224,40 +224,7 @@ class ELF
# (gnu_hash(sym[N].name) & ~1) | (N == dynsymcount-1 || (gnu_hash(sym[N].name) % nbucket) != (gnu_hash(sym[N+1].name) % nbucket))
# that's the hash, with its lower bit replaced by the bool [1 if i am the last sym having my hash as hash]
# we're going to decode the symbol table, and we just want to get the nr of symbols to read
if just_get_count
# index of highest hashed (exported) symbols
ns = hsymcount+symndx
# no way to get the number of non-exported symbols from what we have here
# so we'll decode all relocs and use the largest index we see..
rels = []
if @encoded.ptr = @tag['REL'] and @tag['RELENT'] == Relocation.size(self)
p_end = @encoded.ptr + @tag['RELSZ']
while @encoded.ptr < p_end
rels << Relocation.decode(self)
end
end
if @encoded.ptr = @tag['RELA'] and @tag['RELAENT'] == RelocationAddend.size(self)
p_end = @encoded.ptr + @tag['RELASZ']
while @encoded.ptr < p_end
rels << RelocationAddend.decode(self)
end
end
if @encoded.ptr = @tag['JMPREL'] and relcls = case @tag['PLTREL']
when 'REL'; Relocation
when 'RELA'; RelocationAddend
end
p_end = @encoded.ptr + @tag['PLTRELSZ']
while @encoded.ptr < p_end
rels << relcls.decode(self)
end
end
maxr = rels.map { |rel| rel.symbol }.grep(::Integer).max || -1
return [ns, maxr+1].max
end
return hsymcount+symndx if just_get_count
# TODO
end
@@ -429,14 +396,12 @@ class ELF
raise 'Invalid symbol table' if sec.size > @encoded.length
(sec.size / Symbol.size(self)).times { syms << Symbol.decode(self, strtab) }
alreadysegs = true if @header.type == 'DYN' or @header.type == 'EXEC'
alreadysyms = @symbols.inject({}) { |h, s| h.update s.name => true } if alreadysegs
syms.each { |s|
if alreadysegs
# if we already decoded the symbols from the DYNAMIC segment,
# ignore dups and imports from this section
next if s.shndx == 'UNDEF'
next if alreadysyms[s.name]
alreadysyms[s.name] = true
next if @symbols.find { |ss| ss.name == s.name }
end
@symbols << s
decode_symbol_export(s)
@@ -544,28 +509,10 @@ class ELF
end
end
# returns the target of a relocation using reloc.symbol
# may create new labels if the relocation targets a section
def reloc_target(reloc)
target = 0
if reloc.symbol.kind_of?(Symbol)
if reloc.symbol.type == 'SECTION'
s = @sections[reloc.symbol.shndx]
if not target = @encoded.inv_export[s.offset]
target = new_label(s.name)
@encoded.add_export(target, s.offset)
end
elsif reloc.symbol.name
target = reloc.symbol.name
end
end
target
end
# returns the Metasm::Relocation that should be applied for reloc
# self.encoded.ptr must point to the location that will be relocated (for implicit addends)
def arch_decode_segments_reloc_386(reloc)
if reloc.symbol.kind_of?(Symbol) and n = reloc.symbol.name and reloc.symbol.shndx == 'UNDEF' and @sections and
if reloc.symbol and n = reloc.symbol.name and reloc.symbol.shndx == 'UNDEF' and @sections and
s = @sections.find { |s_| s_.name and s_.offset <= @encoded.ptr and s_.offset + s_.size > @encoded.ptr }
@encoded.add_export(new_label("#{s.name}_#{n}"), @encoded.ptr, true)
end
@@ -594,14 +541,15 @@ class ELF
when 'GLOB_DAT', 'JMP_SLOT', '32', 'PC32', 'TLS_TPOFF', 'TLS_TPOFF32'
# XXX use versionned version
# lazy jmp_slot ?
target = reloc_target(reloc)
target = 0
target = reloc.symbol.name if reloc.symbol.kind_of?(Symbol) and reloc.symbol.name
target = Expression[target, :-, reloc.offset] if reloc.type == 'PC32'
target = Expression[target, :+, addend] if addend and addend != 0
target = Expression[target, :+, 'tlsoffset'] if reloc.type == 'TLS_TPOFF'
target = Expression[:-, [target, :+, 'tlsoffset']] if reloc.type == 'TLS_TPOFF32'
when 'COPY'
# mark the address pointed as a copy of the relocation target
if not reloc.symbol.kind_of?(Symbol) or not name = reloc.symbol.name
if not reloc.symbol or not name = reloc.symbol.name
puts "W: Elf: symbol to COPY has no name: #{reloc.inspect}" if $VERBOSE
name = ''
end
@@ -619,40 +567,24 @@ class ELF
# returns the Metasm::Relocation that should be applied for reloc
# self.encoded.ptr must point to the location that will be relocated (for implicit addends)
def arch_decode_segments_reloc_mips(reloc)
if reloc.symbol.kind_of?(Symbol) and n = reloc.symbol.name and reloc.symbol.shndx == 'UNDEF' and @sections and
if reloc.symbol and n = reloc.symbol.name and reloc.symbol.shndx == 'UNDEF' and @sections and
s = @sections.find { |s_| s_.name and s_.offset <= @encoded.ptr and s_.offset + s_.size > @encoded.ptr }
@encoded.add_export(new_label("#{s.name}_#{n}"), @encoded.ptr, true)
end
original_word = decode_word
# decode addend if needed
case reloc.type
when 'NONE' # no addend
else addend = reloc.addend || Expression.make_signed(original_word, 32)
else addend = reloc.addend || decode_sword
end
case reloc.type
when 'NONE'
when '32', 'REL32'
target = reloc_target(reloc)
target = 0
target = reloc.symbol.name if reloc.symbol.kind_of?(Symbol) and reloc.symbol.name
target = Expression[target, :-, reloc.offset] if reloc.type == 'REL32'
target = Expression[target, :+, addend] if addend and addend != 0
when '26'
target = reloc_target(reloc)
addend &= 0x3ff_ffff
target = Expression[target, :+, [addend, :<<, 2]] if addend and addend != 0
target = Expression[[original_word, :&, 0xfc0_0000], :|, [[target, :&, 0x3ff_ffff], :>>, 2]]
when 'HI16'
target = reloc_target(reloc)
addend &= 0xffff
target = Expression[target, :+, [addend, :<<, 16]] if addend and addend != 0
target = Expression[[original_word, :&, 0xffff_0000], :|, [[target, :>>, 16], :&, 0xffff]]
when 'LO16'
target = reloc_target(reloc)
addend &= 0xffff
target = Expression[target, :+, addend] if addend and addend != 0
target = Expression[[original_word, :&, 0xffff_0000], :|, [target, :&, 0xffff]]
else
puts "W: Elf: unhandled MIPS reloc #{reloc.inspect}" if $VERBOSE
target = nil
@@ -664,7 +596,7 @@ class ELF
# returns the Metasm::Relocation that should be applied for reloc
# self.encoded.ptr must point to the location that will be relocated (for implicit addends)
def arch_decode_segments_reloc_x86_64(reloc)
if reloc.symbol.kind_of?(Symbol) and n = reloc.symbol.name and reloc.symbol.shndx == 'UNDEF' and @sections and
if reloc.symbol and n = reloc.symbol.name and reloc.symbol.shndx == 'UNDEF' and @sections and
s = @sections.find { |s_| s_.name and s_.offset <= @encoded.ptr and s_.offset + s_.size > @encoded.ptr }
@encoded.add_export(new_label("#{s.name}_#{n}"), @encoded.ptr, true)
end
@@ -695,13 +627,14 @@ class ELF
when 'GLOB_DAT', 'JMP_SLOT', '64', 'PC64', '32', 'PC32'
# XXX use versionned version
# lazy jmp_slot ?
target = reloc_target(reloc)
target = 0
target = reloc.symbol.name if reloc.symbol.kind_of?(Symbol) and reloc.symbol.name
target = Expression[target, :-, reloc.offset] if reloc.type == 'PC64' or reloc.type == 'PC32'
target = Expression[target, :+, addend] if addend and addend != 0
sz = :u32 if reloc.type == '32' or reloc.type == 'PC32'
when 'COPY'
# mark the address pointed as a copy of the relocation target
if not reloc.symbol.kind_of?(Symbol) or not name = reloc.symbol.name
if not reloc.symbol or not name = reloc.symbol.name
puts "W: Elf: symbol to COPY has no name: #{reloc.inspect}" if $VERBOSE
name = ''
end
@@ -716,33 +649,6 @@ class ELF
Metasm::Relocation.new(Expression[target], sz, @endianness) if target
end
def arch_decode_segments_reloc_sh(reloc)
if reloc.symbol.kind_of?(Symbol) and n = reloc.symbol.name and reloc.symbol.shndx == 'UNDEF' and @sections and
s = @sections.find { |s_| s_.name and s_.offset <= @encoded.ptr and s_.offset + s_.size > @encoded.ptr }
@encoded.add_export(new_label("#{s.name}_#{n}"), @encoded.ptr, true)
end
original_word = decode_word
# decode addend if needed
case reloc.type
when 'NONE' # no addend
else addend = reloc.addend || Expression.make_signed(original_word, 32)
end
case reloc.type
when 'NONE'
when 'GLOB_DAT', 'JMP_SLOT'
target = reloc_target(reloc)
target = Expression[target, :+, addend] if addend and addend != 0
else
puts "W: Elf: unhandled SH reloc #{reloc.inspect}" if $VERBOSE
target = nil
end
Metasm::Relocation.new(Expression[target], :u32, @endianness) if target
end
class DwarfDebug
# decode a DWARF2 'compilation unit'
def decode(elf, info, abbrev, str)
@@ -843,13 +749,12 @@ class ELF
end
# decodes the ELF dynamic tags, interpret them, and decodes symbols and relocs
def decode_segments_dynamic(decode_relocs=true)
def decode_segments_dynamic
return if not dynamic = @segments.find { |s| s.type == 'DYNAMIC' }
@encoded.ptr = add_label('dynamic_tags', dynamic.vaddr)
decode_tags
decode_segments_tags_interpret
decode_segments_symbols
return if not decode_relocs
decode_segments_relocs
decode_segments_relocs_interpret
end
@@ -878,7 +783,6 @@ class ELF
# decodes sections, interprets symbols/relocs, fills sections.encoded
def decode_sections
@symbols.clear # the NULL symbol is explicit in the symbol table
decode_sections_symbols
decode_sections_relocs
@sections.each { |s|
@@ -900,7 +804,7 @@ class ELF
end
def decode_exports
decode_segments_dynamic(false)
decode_segments_dynamic
end
# decodes the elf header, and depending on the elf type, decode segments or sections
@@ -915,14 +819,12 @@ class ELF
def each_section
@segments.each { |s| yield s.encoded, s.vaddr if s.type == 'LOAD' }
return if @header.type != 'REL'
return if @header.type != 'REL'
@sections.each { |s|
next if not s.encoded
if not l = s.encoded.inv_export[0] or l != s.name.tr('^a-zA-Z0-9_', '_')
l = new_label(s.name)
s.encoded.add_export l, 0
end
yield s.encoded, l
l = new_label(s.name)
s.encoded.add_export l, 0
yield s.encoded, l
}
end
@@ -931,10 +833,9 @@ class ELF
case @header.machine
when 'X86_64'; X86_64.new
when '386'; Ia32.new
when 'MIPS'; (@header.flags.include?('32BITMODE') ? MIPS64 : MIPS).new @endianness
when 'MIPS'; MIPS.new @endianness
when 'PPC'; PPC.new
when 'ARM'; ARM.new
when 'SH'; Sh4.new
else raise "unsupported cpu #{@header.machine}"
end
end
@@ -1011,13 +912,6 @@ EOC
(d.address_binding[s.value] ||= {})[:$t9] ||= Expression[s.value]
}
d.function[:default] = @cpu.disassembler_default_func
when 'sh4'
noret = DecodedFunction.new
noret.noreturn = true
%w[__stack_chk_fail abort exit].each { |fn|
d.function[Expression[fn]] = noret
}
d.function[:default] = @cpu.disassembler_default_func
end
d
end
+20 -111
@@ -47,8 +47,8 @@ class ELF
# defines the @name_p field from @name and elf.section[elf.header.shstrndx]
# creates .shstrtab if needed
def make_name_p elf
return 0 if not name or @name == '' or elf.header.shnum == 0
if elf.header.shstrndx.to_i == 0 or not elf.sections[elf.header.shstrndx]
return 0 if not name or @name == ''
if elf.header.shstrndx.to_i == 0
sn = Section.new
sn.name = '.shstrtab'
sn.type = 'STRTAB'
@@ -140,9 +140,6 @@ class ELF
srank = rank[s]
nexts = @sections.find { |sec| rank[sec] > srank } # find section with rank superior
nexts = nexts ? @sections.index(nexts) : -1 # if none, last
if @header.shstrndx.to_i != 0 and nexts != -1 and @header.shstrndx >= nexts
@header.shstrndx += 1
end
@sections.insert(nexts, s) # insert section
end
@@ -199,8 +196,6 @@ class ELF
# encodes the symbol dynamic hash table in the .hash section, updates the HASH tag
def encode_hash
return if @symbols.length <= 1
if not hash = @sections.find { |s| s.type == 'HASH' }
hash = Section.new
hash.name = '.hash'
@@ -241,8 +236,6 @@ class ELF
# encodes the symbol table
# should have a stable self.sections array (only append allowed after this step)
def encode_segments_symbols(strtab)
return if @symbols.length <= 1
if not dynsym = @sections.find { |s| s.type == 'DYNSYM' }
dynsym = Section.new
dynsym.name = '.dynsym'
@@ -268,7 +261,7 @@ class ELF
# encodes the relocation tables
# needs a complete self.symbols array
def encode_segments_relocs
return if not @relocations or @relocations.empty?
return if not @relocations
arch_preencode_reloc_func = "arch_#{@header.machine.downcase}_preencode_reloc"
send arch_preencode_reloc_func if respond_to? arch_preencode_reloc_func
@@ -344,8 +337,6 @@ class ELF
# creates the .plt/.got from the @relocations
def arch_386_preencode_reloc
return if @relocations.empty?
# if .got.plt does not exist, the dynamic loader segfaults
if not gotplt = @sections.find { |s| s.type == 'PROGBITS' and s.name == '.got.plt' }
gotplt = Section.new
@@ -367,7 +358,7 @@ class ELF
when 'PC32'
next if not r.symbol
if r.symbol.type != 'FUNC'
if r.symbol.type != 'FUNC'
# external data xref: generate a GOT entry
# XXX reuse .got.plt ?
if not got ||= @sections.find { |s| s.type == 'PROGBITS' and s.name == '.got' }
@@ -394,7 +385,7 @@ class ELF
else
@relocations.delete r
end
# prevoffset is label_section_start + int_section_offset
target_s = @sections.find { |s| s.encoded and s.encoded.export[prevoffset.lexpr] == 0 }
rel = target_s.encoded.reloc[prevoffset.rexpr]
@@ -420,12 +411,12 @@ class ELF
#
# [.got.plt header]
# dd _DYNAMIC
# dd 0 # rewritten to GOTPLT? by ld-linux
# dd 0 # rewritten to GOTPLT? by ld-linux
# dd 0 # rewritten to dlresolve_inplace by ld-linux
#
# [.got.plt + func_got_offset]
# dd some_func_got_default # lazily rewritten to the real addr of some_func by jmp dlresolve_inplace
# # base_relocated ?
# # base_relocated ?
# in the PIC case, _dlresolve imposes us to use the ebx register (which may not be saved by the calling function..)
# also geteip trashes eax, which may interfere with regparm(3)
@@ -540,7 +531,7 @@ class ELF
# fill these later, but create the base relocs now
arch_create_reloc_func = "arch_#{@header.machine.downcase}_create_reloc"
next if not respond_to?(arch_create_reloc_func)
next if not respond_to?(arch_create_reloc_func)
curaddr = label_at(@encoded, 0, 'elf_start')
fkbind = {}
@sections.each { |s|
@@ -566,9 +557,6 @@ class ELF
encode_check_section_size strtab
# rm unused tag (shrink .nointerp binaries by allowing to skip the section entirely)
@tag.delete('STRTAB') if strtab.encoded.length == 1
# XXX any order needed ?
@tag.keys.each { |k|
case k
@@ -593,7 +581,7 @@ class ELF
encode_tag[k, @tag[k]]
end
}
encode_tag['NULL', @tag['NULL'] || 0] unless @tag.empty?
encode_tag['NULL', @tag['NULL'] || 0]
encode_check_section_size dynamic
end
@@ -610,13 +598,17 @@ class ELF
@sections.each { |s|
next if not s.encoded
s.encoded.reloc.each_value { |r|
et = r.target.externals
extern = et.find_all { |name| autoexports[name] }
next if extern.length != 1
symname = extern.first
t = Expression[r.target.reduce]
if t.op == :+ and t.rexpr.kind_of? Expression and t.rexpr.op == :- and not t.rexpr.lexpr and
t.rexpr.rexpr.kind_of?(::String) and t.lexpr.kind_of?(::String)
symname = t.lexpr
else
symname = t.reduce_rec
end
next if not dll = autoexports[symname]
if not @symbols.find { |sym| sym.name == symname }
@tag['NEEDED'] ||= []
@tag['NEEDED'] |= [autoexports[symname]]
@tag['NEEDED'] |= [dll]
sym = Symbol.new
sym.shndx = 'UNDEF'
sym.type = 'FUNC'
@@ -745,55 +737,6 @@ class ELF
@relocations << r
end
def arch_mips_create_reloc(section, off, binding, rel=nil)
rel ||= section.encoded.reloc[off]
startaddr = label_at(@encoded, 0)
r = Relocation.new
r.offset = Expression[label_at(section.encoded, 0, 'sect_start'), :+, off]
if Expression[rel.target, :-, startaddr].bind(binding).reduce.kind_of?(::Integer)
# this location is relative to the base load address of the ELF
r.type = 'REL32'
else
et = rel.target.externals
extern = et.find_all { |name| not binding[name] }
if extern.length != 1
puts "ELF: mips_create_reloc: ignoring reloc #{rel.target} in #{section.name}: #{extern.inspect} unknown" if $VERBOSE
return
end
if not sym = @symbols.find { |s| s.name == extern.first }
puts "ELF: mips_create_reloc: ignoring reloc #{rel.target} in #{section.name}: undefined symbol #{extern.first}" if $VERBOSE
return
end
r.symbol = sym
if Expression[rel.target, :-, sym.name].bind(binding).reduce.kind_of?(::Integer)
rel.target = Expression[rel.target, :-, sym.name]
r.type = '32'
elsif Expression[rel.target, :&, 0xffff0000].reduce.kind_of?(::Integer)
lo = Expression[rel.target, :&, 0xffff].reduce
lo = lo.lexpr if lo.kind_of?(Expression) and lo.op == :& and lo.rexpr == 0xffff
if lo.kind_of?(Expression) and lo.op == :>> and lo.rexpr == 16
r.type = 'HI16'
rel.target = Expression[rel.target, :&, 0xffff0000]
# XXX offset ?
elsif lo.kind_of?(String) or (lo.kind_of?(Expression) and lo.op == :+)
r.type = 'LO16'
rel.target = Expression[rel.target, :&, 0xffff0000]
# XXX offset ?
else
puts "ELF: mips_create_reloc: ignoring reloc #{lo}: cannot find matching 16 reloc type" if $VERBOSE
return
end
#elsif Expression[rel.target, :+, label_at(section.encoded, 0)].bind(section.encoded.binding).reduce.kind_of? ::Integer
# rel.target = Expression[[rel.target, :+, label_at(section.encoded, 0)], :+, off]
# r.type = 'PC32'
else
puts "ELF: mips_create_reloc: ignoring reloc #{sym.name} + #{rel.target}: cannot find matching standard reloc type" if $VERBOSE
return
end
end
@relocations << r
end
# resets the fields of the elf headers that should be recalculated, eg phdr offset
def invalidate_header
@header.shoff = @header.shnum = nil
@@ -874,13 +817,9 @@ class ELF
end
if @header.type == 'REL'
encode_rel
else
encode_elf
raise 'ET_REL encoding not supported atm, come back later'
end
end
def encode_elf
@encoded = EncodedData.new
if @header.type != 'EXEC' or @segments.find { |i| i.type == 'INTERP' }
# create a .dynamic section unless we are an ET_EXEC with .nointerp
@@ -931,7 +870,7 @@ class ELF
end
# add dynamic segment
if ds = @sections.find { |sec| sec.type == 'DYNAMIC' } and ds.encoded.length > 1
if ds = @sections.find { |sec| sec.type == 'DYNAMIC' }
ds.set_default_values self
seg = Segment.new
seg.type = 'DYNAMIC'
@@ -1040,36 +979,6 @@ class ELF
@encoded.data
end
def encode_rel
@encoded = EncodedData.new
automagic_symbols
create_relocations
@header.phoff = @header.phnum = @header.phentsize = 0
@header.entry = 0
@sections.each { |sec| sec.addr = 0 }
st = @sections.inject(EncodedData.new) { |edata, sec| edata << sec.encode(self) }
binding = {}
@encoded << @header.encode(self)
@encoded.align 8
binding[@header.shoff] = @encoded.length
@encoded << st
@encoded.align 8
@sections.each { |sec|
next if not sec.encoded
binding[sec.offset] = @encoded.length
sec.encoded.fixup sec.encoded.binding
@encoded << sec.encoded
@encoded.align 8
}
@encoded.fixup! binding
@encoded.data
end
def parse_init
# allow the user to specify a section, falls back to .text if none specified
if not defined? @cursource or not @cursource
-64
@@ -1,64 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
require 'metasm/encode'
require 'metasm/decode'
module Metasm
# GameBoy ROM file format
class GameBoyRom < ExeFormat
class Header < SerialStruct
# starts at 0x104 in the file
mem :logo, 0x30
str :title, 0x10
byte :sgb_flag
byte :cartridge_type
byte :rom_size # n => (n+1) * 32k bytes
byte :ram_size
byte :destination_code
byte :old_licensee_code
byte :mask_rom_version
byte :header_checksum
byte :checksum_hi
byte :checksum_lo
end
def encode_byte(val) Expression[val].encode(:u8, @endianness) end
def decode_byte(edata = @encoded) edata.decode_imm(:u8, @endianness) end
attr_accessor :header
def initialize(cpu=nil)
@endianness = (cpu ? cpu.endianness : :little)
super(cpu)
end
def decode_header
@encoded.ptr = 0x104
@header = Header.decode(self)
end
def decode
decode_header
@encoded.add_export('entrypoint', 0x100)
end
def cpu_from_headers
Z80.new('gb')
end
def each_section
yield @encoded, 0
end
def get_default_entrypoints
['entrypoint']
end
end
end
-421
@@ -1,421 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
require 'metasm/encode'
require 'metasm/decode'
module Metasm
class JavaClass < ExeFormat
MAGIC = "\xCA\xFE\xBA\xBE"
CONSTANT_TAG = {0x1 => 'Utf8', 0x3 => 'Integer',
0x4 => 'Float', 0x5 => 'Long',
0x6 => 'Double', 0x7 => 'Class',
0x8 => 'String', 0x9 => 'Fieldref',
0xa => 'Methodref', 0xb => 'InterfaceMethodref',
0xc => 'NameAndType' }
class SerialStruct < Metasm::SerialStruct
new_int_field :u1, :u2, :u4
end
class Header < SerialStruct
mem :magic, 4, MAGIC
u2 :minor_version
u2 :major_version
end
class ConstantPool < SerialStruct
u2 :constant_pool_count
attr_accessor :constant_pool
def decode(c)
super(c)
@constant_pool = [nil]
i = 1
while i < @constant_pool_count
entry = ConstantPoolInfo.decode(c)
entry.idx = i
@constant_pool << entry
i += 1
if entry.tag =~ /Long|Double/
# we must insert a phantom cell
# for long and double constants
@constant_pool << nil
i += 1
end
end
end
def encode(c)
cp = super(c)
@constant_pool.each { |entry|
next if entry.nil?
cp << entry.encode(c)
}
cp
end
def [](idx)
@constant_pool[idx]
end
def []=(idx, val)
raise 'cannot be used to add a cp entry' if @constant_pool[idx].nil?
@constant_pool[idx] = val
end
end
class ConstantPoolInfo < SerialStruct
u1 :tag
fld_enum :tag, CONSTANT_TAG
attr_accessor :info, :idx
def decode(c)
super(c)
case @tag
when 'Utf8'
@info = ConstantUtf8.decode(c)
when /Integer|Float/
@info = ConstantIntFloat.decode(c)
when /Long|Double/
@info = ConstantLongDouble.decode(c)
when /Class|String/
@info = ConstantIndex.decode(c)
when /ref$/
@info = ConstantRef.decode(c)
when 'NameAndType'
@info = ConstantNameAndType.decode(c)
else
raise 'unknown constant tag'
return
end
end
def encode(c)
super(c) << @info.encode(c)
end
end
class ConstantUtf8 < SerialStruct
u2 :length
attr_accessor :bytes
def decode(c)
super(c)
@bytes = c.encoded.read(@length)
end
def encode(c)
super(c) << @bytes
end
end
class ConstantIntFloat < SerialStruct
u4 :bytes
end
class ConstantLongDouble < SerialStruct
u4 :high_bytes
u4 :low_bytes
end
class ConstantIndex < SerialStruct
u2 :index
end
class ConstantRef < SerialStruct
u2 :class_index
u2 :name_and_type_index
end
class ConstantNameAndType < SerialStruct
u2 :name_index
u2 :descriptor_index
end
class ClassInfo < SerialStruct
u2 :access_flags
u2 :this_class
u2 :super_class
end
class Interfaces < SerialStruct
u2 :interfaces_count
attr_accessor :interfaces
def decode(c)
super(c)
@interfaces = []
@interfaces_count.times {
@interfaces << ConstantIndex.decode(c)
}
end
def encode(c)
ret = super(c)
@interfaces.each { |e|
ret << e.encode(c)
}
ret
end
def [](idx)
@interfaces[idx]
end
end
class Fields < SerialStruct
u2 :fields_count
attr_accessor :fields
def decode(c)
super(c)
@fields = []
@fields_count.times {
@fields << FieldMethodInfo.decode(c)
}
end
def encode(c)
ret = super(c)
@fields.each { |e|
ret << e.encode(c)
}
ret
end
def [](idx)
@fields[idx]
end
end
class Methods < SerialStruct
u2 :methods_count
attr_accessor :methods
def decode(c)
super(c)
@methods = []
@methods_count.times {
@methods << FieldMethodInfo.decode(c)
}
end
def encode(c)
ret = super(c)
@methods.each { |e|
ret << e.encode(c)
}
ret
end
def [](idx)
@methods[idx]
end
end
class FieldMethodInfo < SerialStruct
u2 :access_flags
u2 :name_index
u2 :descriptor_index
attr_accessor :attributes
def decode(c)
super(c)
@attributes = Attributes.decode(c)
end
def encode(c)
super(c) << @attributes.encode(c)
end
end
class Attributes < SerialStruct
u2 :attributes_count
attr_accessor :attributes
def decode(c)
super(c)
@attributes = []
@attributes_count.times { |i|
@attributes << AttributeInfo.decode(c)
}
end
def encode(c)
ret = super(c)
@attributes.each { |e|
ret << e.encode(c)
}
ret
end
def [](idx)
@attributes[idx]
end
end
class AttributeInfo < SerialStruct
u2 :attribute_name_index
u4 :attribute_length
attr_accessor :data
def decode(c)
super(c)
@data = c.encoded.read(@attribute_length)
end
def encode(c)
super(c) << @data
end
end
def encode_u1(val) Expression[val].encode(:u8, @endianness) end
def encode_u2(val) Expression[val].encode(:u16, @endianness) end
def encode_u4(val) Expression[val].encode(:u32, @endianness) end
def decode_u1(edata = @encoded) edata.decode_imm(:u8, @endianness) end
def decode_u2(edata = @encoded) edata.decode_imm(:u16, @endianness) end
def decode_u4(edata = @encoded) edata.decode_imm(:u32, @endianness) end
attr_accessor :header, :constant_pool, :class_info, :interfaces, :fields, :methods, :attributes
def initialize(endianness=:big)
@endianness = endianness
@encoded = EncodedData.new
super()
end
def decode
@header = Header.decode(self)
@constant_pool = ConstantPool.decode(self)
@class_info = ClassInfo.decode(self)
@interfaces = Interfaces.decode(self)
@fields = Fields.decode(self)
@methods = Methods.decode(self)
@attributes = Attributes.decode(self)
end
def encode
@encoded = EncodedData.new
@encoded << @header.encode(self)
@encoded << @constant_pool.encode(self)
@encoded << @class_info.encode(self)
@encoded << @interfaces.encode(self)
@encoded << @fields.encode(self)
@encoded << @methods.encode(self)
@encoded << @attributes.encode(self)
@encoded.data
end
def cpu_from_headers
raise 'JVM'
end
def each_section
raise 'n/a'
end
def get_default_entrypoints
[]
end
def string_at(idx)
loop do
tmp = @constant_pool[idx].info
return tmp.bytes if tmp.kind_of? ConstantUtf8
idx = tmp.index
end
end
def decode_methodref(mref)
class_idx = mref.info.class_index
nt_idx = mref.info.name_and_type_index
name_idx = @constant_pool[nt_idx].info.name_index
desc_idx = @constant_pool[nt_idx].info.descriptor_index
string_at(class_idx) + '/' + string_at(name_idx) + string_at(desc_idx)
end
def cp_add(cpi, tag)
cpe = ConstantPoolInfo.new
cpe.tag = tag
cpe.info = cpi
cpe.idx = @constant_pool.constant_pool_count
@constant_pool.constant_pool << cpe
@constant_pool.constant_pool_count += 1
@constant_pool.constant_pool_count += 1 if tag =~ /Long|Double/
cpe.idx
end
def cp_find(tag)
constant_pool.constant_pool.each { |e|
next if !e or e.tag != tag
if yield(e.info)
return e.idx
end
}
nil
end
def cp_auto_utf8(string)
if idx = cp_find('Utf8') { |i| i.bytes == string }
return idx
end
cpi = ConstantUtf8.new
cpi.bytes = string
cpi.length = string.length
cp_add(cpi, 'Utf8')
end
def cp_auto_class(classname)
if idx = cp_find('Class') { |i| string_at(i.index) == classname }
return idx
end
cpi = ConstantIndex.new
cpi.index = cp_auto_utf8(classname)
cp_add(cpi, 'Class')
end
def cp_add_methodref(classname, name, descriptor)
nat = ConstantNameAndType.new
nat.name_index = cp_auto_utf8(name)
nat.descriptor_index = cp_auto_utf8(descriptor)
natidx = cp_add(nat, 'NameAndType')
cpi = ConstantRef.new
cpi.class_index = cp_auto_class(classname)
cpi.name_and_type_index = natidx
cp_add(cpi, 'Methodref')
end
def attribute_create(name, data)
a = AttributeInfo.new
a.attribute_name_index = cp_auto_utf8(name)
a.attribute_length = data.size
a.data = data
a
end
end
end
+17 -200
@@ -17,9 +17,6 @@ class MachO < ExeFormat
MAGICS = [MAGIC, CIGAM, MAGIC64, CIGAM64]
# "a" != "a" lolz!
MAGICS.each { |s| s.force_encoding('BINARY') } if MAGIC.respond_to?(:force_encoding)
CPU = {
1 => 'VAX', 2 => 'ROMP',
4 => 'NS32032', 5 => 'NS32332',
@@ -47,7 +44,7 @@ class MachO < ExeFormat
3 => 'MMAX_APC_FPU', 4 => 'MMAX_APC_FPA', 5 => 'MMAX_XPC',
},
'I386' => { 3 => 'ALL', 4 => '486', 4+128 => '486SX',
0 => 'INTEL_MODEL_ALL', 10 => 'PENTIUM_4',
0 => 'INTEL_MODEL_ALL', 10 => 'PENTIUM_4',
5 => 'PENT', 0x16 => 'PENTPRO', 0x36 => 'PENTII_M3', 0x56 => 'PENTII_M5',
},
'MIPS' => { 0 => 'ALL', 1 => 'R2300', 2 => 'R2600', 3 => 'R2800', 4 => 'R2000a', },
@@ -55,7 +52,6 @@ class MachO < ExeFormat
'HPPA' => { 0 => 'ALL', 1 => '7100LC', },
'ARM' => { 0 => 'ALL', 1 => 'A500_ARCH', 2 => 'A500', 3 => 'A440',
4 => 'M4', 5 => 'A680', 6 => 'ARMV6', 9 => 'ARMV7',
11 => 'ARMV7S',
},
'MC88000' => { 0 => 'ALL', 1 => 'MC88100', 2 => 'MC88110', },
:wtf => { 0 => 'MC98000_ALL', 1 => 'MC98601', },
@@ -86,7 +82,7 @@ class MachO < ExeFormat
0x10 => 'PREBOUND', 0x20 => 'SPLIT_SEGS', 0x40 => 'LAZY_INIT', 0x80 => 'TWOLEVEL',
0x100 => 'FORCE_FLAT', 0x200 => 'NOMULTIDEFS', 0x400 => 'NOFIXPREBINDING', 0x800 => 'PREBINDABLE',
0x1000 => 'ALLMODSBOUND', 0x2000 => 'SUBSECTIONS_VIA_SYMBOLS', 0x4000 => 'CANONICAL', 0x8000 => 'WEAK_DEFINES',
0x10000 => 'BINDS_TO_WEAK', 0x20000 => 'ALLOW_STACK_EXECUTION', 0x200000 => 'MH_PIE',
0x10000 => 'BINDS_TO_WEAK', 0x20000 => 'ALLOW_STACK_EXECUTION',
}
SEG_PROT = { 1 => 'READ', 2 => 'WRITE', 4 => 'EXECUTE' }
@@ -100,13 +96,12 @@ class MachO < ExeFormat
0x15 => 'SUB_LIBRARY', 0x16 => 'TWOLEVEL_HINTS', 0x17 => 'PREBIND_CKSUM',
0x8000_0018 => 'LOAD_WEAK_DYLIB', 0x19 => 'SEGMENT_64', 0x1a => 'ROUTINES_64',
0x1b => 'UUID', 0x8000_001c => 'RPATH', 0x1d => 'CODE_SIGNATURE_PTR', 0x1e => 'CODE_SEGMENT_SPLIT_INFO',
0x21 => 'ENCRYPTION_INFO',
0x8000_001f => 'REEXPORT_DYLIB',
#0x8000_0000 => 'REQ_DYLD',
}
THREAD_FLAVOR = {
'POWERPC' => {
'POWERPC' => {
1 => 'THREAD_STATE',
2 => 'FLOAT_STATE',
3 => 'EXCEPTION_STATE',
@@ -131,15 +126,6 @@ class MachO < ExeFormat
SYM_SCOPE = { 0 => 'LOCAL', 1 => 'GLOBAL' }
SYM_TYPE = { 0 => 'UNDF', 2/2 => 'ABS', 0xa/2 => 'INDR', 0xe/2 => 'SECT', 0x1e/2 => 'TYPE' }
SYM_STAB = { }
IND_SYM_IDX = { 0x4000_0000 => 'INDIRECT_SYMBOL_ABS', 0x8000_0000 => 'INDIRECT_SYMBOL_LOCAL' }
GENERIC_RELOC = { 0 => 'VANILLA', 1 => 'PAIR', 2 => 'SECTDIFF', 3 => 'LOCAL_SECTDIFF', 4 => 'PB_LA_PTR' }
SEC_TYPE = {
0 => 'REGULAR', 1 => 'ZEROFILL', 2 => 'CSTRING_LITERALS', 3 => '4BYTE_LITERALS',
4 => '8BYTE_LITERALS', 5 => 'LITERAL_POINTERS', 6 => 'NON_LAZY_SYMBOL_POINTERS',
7 => 'LAZY_SYMBOL_POINTERS', 8 => 'SYMBOL_STUBS', 9 => 'MOD_INIT_FUNC_POINTERS'
}
class SerialStruct < Metasm::SerialStruct
new_int_field :xword
@@ -194,7 +180,7 @@ class MachO < ExeFormat
def decode(m)
super(m)
ptr = m.encoded.ptr
if @cmd.kind_of?(String) and self.class.constants.map { |c| c.to_s }.include?(@cmd)
if @cmd.kind_of? String and self.class.constants.map { |c| c.to_s }.include? @cmd
@data = self.class.const_get(@cmd).decode(m)
end
m.encoded.ptr = ptr + @cmdsize - 8
@@ -207,7 +193,7 @@ class MachO < ExeFormat
end
def encode(m)
ed = super(m)
ed = super(m)
ed << @data.encode(m) if @data
ed.align(m.size >> 3)
ed.fixup! @cmdsize => ed.length if @cmdsize.kind_of? String
@@ -257,10 +243,7 @@ class MachO < ExeFormat
str :name, 16
str :segname, 16
xwords :addr, :size
words :offset, :align, :reloff, :nreloc
bitfield :word, 0 => :type, 8 => :attributes_sys, 24 => :attributes_usr
words :res1, :res2
fld_enum :type, SEC_TYPE
words :offset, :align, :reloff, :nreloc, :flags, :res1, :res2
attr_accessor :res3 # word 64bit only
attr_accessor :segment, :encoded
@@ -275,6 +258,10 @@ class MachO < ExeFormat
# addr, offset, etc = @segment.virtaddr + 42
super(m)
end
def decode_inner(m)
@encoded = m.encoded[m.addr_to_off(@addr), @size]
end
end
SECTION_64 = SECTION
@@ -292,7 +279,7 @@ class MachO < ExeFormat
words :flavor, :count
fld_enum(:flavor) { |m, t| THREAD_FLAVOR[m.header.cputype] || {} }
attr_accessor :ctx
def entrypoint(m)
@ctx ||= {}
case m.header.cputype
@@ -359,7 +346,6 @@ class MachO < ExeFormat
end
LOAD_DYLIB = DYLIB
ID_DYLIB = DYLIB
LOAD_WEAK_DYLIB = DYLIB
class PREBOUND_DYLIB < STRING
word :stroff
@@ -370,10 +356,6 @@ class MachO < ExeFormat
LOAD_DYLINKER = STRING
ID_DYLINKER = STRING
class ENCRYPTION_INFO < SerialStruct
words :cryptoff, :cryptsize, :cryptid
end
class ROUTINES < SerialStruct
xwords :init_addr, :init_module, :res1, :res2, :res3, :res4, :res5, :res6
end
@@ -406,7 +388,7 @@ class MachO < ExeFormat
end
end
class CODE_SIGNATURE < SerialStruct
class CODE_SIGNATURE < SerialStruct
word :magic
word :size
word :count
@@ -497,12 +479,6 @@ class MachO < ExeFormat
end
end
class Relocation < SerialStruct
word :address
bitfield :word, 0 => :symbolnum, 24 => :pcrel, 25 => :length, 27 => :extern, 28 => :type
fld_enum :type, GENERIC_RELOC
end
def encode_byte(val) Expression[val].encode( :u8, @endianness) end
def encode_half(val) Expression[val].encode(:u16, @endianness) end
def encode_word(val) Expression[val].encode(:u32, @endianness) end
@@ -518,7 +494,6 @@ class MachO < ExeFormat
attr_accessor :segments
attr_accessor :commands
attr_accessor :symbols
attr_accessor :relocs
def initialize(cpu=nil)
super(cpu)
@@ -548,35 +523,6 @@ class MachO < ExeFormat
decode_relocations
end
# return the segment containing address, set seg.encoded.ptr to the correct offset
def segment_at(addr)
return if not addr or addr <= 0
if seg = @segments.find { |seg_| addr >= seg_.virtaddr and addr < seg_.virtaddr + seg_.virtsize }
seg.encoded.ptr = addr - seg.virtaddr
seg
end
end
def addr_to_fileoff(addr)
s = @segments.find { |s_| s_.virtaddr <= addr and s_.virtaddr + s_.virtsize > addr } if addr
addr - s.virtaddr + s.fileoff if s
end
def fileoff_to_addr(foff)
if s = @segments.find { |s_| s_.fileoff <= foff and s_.fileoff + s_.filesize > foff }
s.virtaddr + module_address + foff - s.fileoff
end
end
def module_address
@segments.map { |s_| s_.virtaddr }.min || 0
end
def module_size
return 0 if not sz = @segments.map { |s_| s_.virtaddr + s_.virtsize }.max
sz - module_address
end
def decode_symbols
@symbols = []
ep_count = 0
@@ -591,159 +537,30 @@ class MachO < ExeFormat
when 'THREAD', 'UNIXTHREAD'
ep_count += 1
ep = cmd.data.entrypoint(self)
next if not seg = segment_at(ep)
seg.encoded.add_export("entrypoint#{"_#{ep_count}" if ep_count >= 2 }")
next if not seg = @segments.find { |seg_| ep >= seg_.virtaddr and ep < seg_.virtaddr + seg_.virtsize }
seg.encoded.add_export("entrypoint#{"_#{ep_count}" if ep_count >= 2 }", ep - seg.virtaddr)
end
}
@symbols.each { |s|
next if s.value == 0 or not s.name
next if not seg = segment_at(s.value)
seg.encoded.add_export(s.name)
next if not seg = @segments.find { |seg_| s.value >= seg_.virtaddr and s.value < seg_.virtaddr + seg_.virtsize }
seg.encoded.add_export(s.name, s.value - seg.virtaddr)
}
end
def decode_relocations
@relocs = []
indsymtab = []
@commands.each { |cmd|
if cmd.cmd == 'DYSYMTAB'
@encoded.ptr = cmd.data.extreloff
cmd.data.nextrel.times { @relocs << Relocation.decode(self) }
@encoded.ptr = cmd.data.locreloff
cmd.data.nlocrel.times { @relocs << Relocation.decode(self) }
@encoded.ptr = cmd.data.indirectsymoff
cmd.data.nindirectsyms.times { indsymtab << decode_word }
end
}
@segments.each { |seg|
seg.sections.each { |sec|
@encoded.ptr = sec.reloff
sec.nreloc.times { @relocs << Relocation.decode(self) }
case sec.type
when 'NON_LAZY_SYMBOL_POINTERS', 'LAZY_SYMBOL_POINTERS'
edata = seg.encoded
off = sec.offset - seg.fileoff
(sec.size / 4).times { |i|
sidx = indsymtab[sec.res1+i]
case IND_SYM_IDX[sidx]
when 'INDIRECT_SYMBOL_LOCAL' # base reloc: add delta from preferred image base
edata.ptr = off
addr = decode_word(edata)
if s = segment_at(addr)
label = label_at(s.encoded, s.encoded.ptr, "xref_#{Expression[addr]}")
seg.encoded.reloc[off] = Metasm::Relocation.new(Expression[label], :u32, @endianness)
end
when 'INDIRECT_SYMBOL_ABS' # nothing
else
sym = @symbols[sidx]
seg.encoded.reloc[off] = Metasm::Relocation.new(Expression[sym.name], :u32, @endianness)
end
off += 4
}
when 'SYMBOL_STUBS'
# TODO next unless arch == 386 and sec.attrs & SELF_MODIFYING_CODE and sec.res2 == 5
edata = seg.encoded
edata.data = edata.data.to_str.dup
off = sec.offset - seg.fileoff + 1
(sec.size / 5).times { |i|
sidx = indsymtab[sec.res1+i]
case IND_SYM_IDX[sidx]
when 'INDIRECT_SYMBOL_LOCAL' # base reloc: add delta from preferred image base
edata.ptr = off
addr = decode_word(edata)
if s = segment_at(addr)
label = label_at(s.encoded, s.encoded.ptr, "xref_#{Expression[addr]}")
seg.encoded.reloc[off] = Metasm::Relocation.new(Expression[label, :-, Expression[seg.virtaddr, :+, off+4].reduce], :u32, @endianness)
end
when 'INDIRECT_SYMBOL_ABS' # nothing
else
seg.encoded[off-1] = 0xe9
sym = @symbols[sidx]
seg.encoded.reloc[off] = Metasm::Relocation.new(Expression[sym.name, :-, Expression[seg.virtaddr, :+, off+4].reduce], :u32, @endianness)
end
off += 5
}
end
}
}
seg = nil
@relocs.each { |r|
if r.extern == 1
sym = @symbols[r.symbolnum]
seg = @segments.find { |sg| sg.virtaddr <= r.address and sg.virtaddr + sg.virtsize > r.address } unless seg and seg.virtaddr <= r.address and seg.virtaddr + seg.virtsize > r.address
if not seg
puts "macho: reloc to unmapped space #{r.inspect} #{sym.inspect}" if $VERBOSE
next
end
seg.encoded.reloc[r.address - seg.virtaddr] = Metasm::Relocation.new(Expression[sym.name], :u32, @endianness)
end
}
end
def decode_segment(s)
@encoded.add_export(s.name, s.fileoff)
s.encoded = @encoded[s.fileoff, s.filesize]
s.encoded.virtsize = s.virtsize
s.sections.each { |ss|
ss.encoded = @encoded[ss.offset, ss.size]
s.encoded.add_export(ss.name, ss.offset - s.fileoff)
}
s.sections.each { |ss| ss.encoded = @encoded[ss.offset, ss.size] }
end
def each_section(&b)
@segments.each { |s| yield s.encoded, s.virtaddr }
end
def section_info
ret = []
@segments.each { |seg|
ret.concat seg.sections.map { |s| [s.name, s.addr, s.size, s.type] }
}
ret
end
def init_disassembler
d = super()
case @cpu.shortname
when 'ia32', 'x64'
old_cp = d.c_parser
d.c_parser = nil
d.parse_c <<EOC
void *dlsym(int, char *); // has special callback
// standard noreturn, optimized by gcc
void __attribute__((noreturn)) exit(int);
void abort(void) __attribute__((noreturn));
EOC
d.function[Expression['_dlsym']] = d.function[Expression['dlsym']] = dls = @cpu.decode_c_function_prototype(d.c_parser, 'dlsym')
d.function[Expression['_exit']] = d.function[Expression['exit']] = @cpu.decode_c_function_prototype(d.c_parser, 'exit')
d.function[Expression['abort']] = @cpu.decode_c_function_prototype(d.c_parser, 'abort')
d.c_parser = old_cp
dls.btbind_callback = lambda { |dasm, bind, funcaddr, calladdr, expr, origin, maxdepth|
sz = @cpu.size/8
raise 'dlsym call error' if not dasm.decoded[calladdr]
if @cpu.shortname == 'x64'
arg2 = :rsi
else
arg2 = Indirection.new(Expression[:esp, :+, 2*sz], sz, calladdr)
end
fnaddr = dasm.backtrace(arg2, calladdr, :include_start => true, :maxdepth => maxdepth)
if fnaddr.kind_of?(::Array) and fnaddr.length == 1 and s = dasm.decode_strz(fnaddr.first, 64) and s.length > sz
bind = bind.merge @cpu.register_symbols[0] => Expression[s]
end
bind
}
df = d.function[:default] = @cpu.disassembler_default_func
df.backtrace_binding[@cpu.register_symbols[4]] = Expression[@cpu.register_symbols[4], :+, @cpu.size/8]
df.btbind_callback = nil
end
d
end
def get_default_entrypoints
@commands.find_all { |cmd| cmd.cmd == 'THREAD' or cmd.cmd == 'UNIXTHREAD' }.map { |cmd| cmd.data.entrypoint(self) }
end
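The address helpers in the MachO hunks above (segment_at, addr_to_fileoff) all do the same range lookup over the segment list. A minimal standalone sketch of that lookup follows; DemoSegment and demo_addr_to_fileoff are hypothetical names used only for illustration and are not part of Metasm.

```ruby
# Hypothetical stand-in for a decoded MachO segment: only the three
# fields needed for the lookup are modelled here.
DemoSegment = Struct.new(:virtaddr, :virtsize, :fileoff)

# Translate a virtual address to a file offset, or return nil when the
# address is nil or falls outside every segment.
def demo_addr_to_fileoff(segments, addr)
  return nil unless addr
  seg = segments.find { |s| s.virtaddr <= addr && addr < s.virtaddr + s.virtsize }
  addr - seg.virtaddr + seg.fileoff if seg
end

segs = [
  DemoSegment.new(0x1000, 0x2000, 0x0),
  DemoSegment.new(0x3000, 0x1000, 0x2000)
]
off = demo_addr_to_fileoff(segs, 0x3010)
puts off ? format('file offset: 0x%x', off) : 'unmapped'
```

The half-open range test (start inclusive, end exclusive) matters: an address equal to virtaddr + virtsize belongs to the next segment, if any.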
+3 -26
@@ -16,7 +16,7 @@ class ExeFormat
# creates a new instance, populates self.encoded with the supplied string
def self.load(str, *a, &b)
e = new(*a, &b)
if str.kind_of?(EncodedData); e.encoded = str
if str.kind_of? EncodedData; e.encoded = str
else e.encoded << str
end
e
@@ -63,30 +63,6 @@ class ExeFormat
e
end
def load(str)
if str.kind_of?(EncodedData); @encoded = str
else @encoded << str
end
self
end
def load_file(path)
@filename ||= path
load(VirtualFile.read(path))
end
def decode_file(path)
load_file(path)
decode
self
end
def decode_file_header(path)
load_file(path)
decode_header
self
end
# creates a new object using the specified cpu, parses the asm source, and assemble
def self.assemble(cpu, source, file='<unk>', lineno=1)
source, cpu = cpu, source if source.kind_of? CPU
@@ -199,8 +175,9 @@ class ExeFormat
end
# saves the result of +encode_string+ in the specified file
# overwrites existing files
# fails if the file already exists
def encode_file(path, *a)
#raise Errno::EEXIST, path if File.exist? path # race, but cannot use O_EXCL, as O_BINARY is not defined in ruby
encode_string(*a)
File.open(path, 'wb') { |fd| fd.write(@encoded.data) }
end
+4 -4
@@ -30,7 +30,7 @@ class NDS < ExeFormat
mem :secareadisable, 8
words :endoff, :headersz
mem :reserved4, 56
mem :ninlogo, 156
mem :ninlogo, 156
half :logoCRC, 0xcf56
half :headerCRC
end
@@ -75,9 +75,9 @@ class NDS < ExeFormat
attr_accessor :header, :icon, :arm9, :arm7
attr_accessor :files, :fat
def initialize(cpu=nil)
@endianness = (cpu ? cpu.endianness : :little)
super(cpu)
def initialize(endianness=:little)
@endianness = endianness
@encoded = EncodedData.new
end
# decodes the header from the current offset in self.encoded
+8 -11
@@ -215,7 +215,7 @@ EOS
# TODO seh prototype (args => context)
# TODO hook on (non)resolution of :w xref
def get_xrefs_x(dasm, di)
if @cpu.shortname =~ /^ia32|^x64/ and a = di.instruction.args.first and a.kind_of?(Ia32::ModRM) and a.seg and a.seg.val == 4 and
if @cpu.shortname =~ /ia32|x64/ and a = di.instruction.args.first and a.kind_of? Ia32::ModRM and a.seg and a.seg.val == 4 and
w = get_xrefs_rw(dasm, di).find { |type, ptr, len| type == :w and ptr.externals.include? 'segment_base_fs' } and
dasm.backtrace(Expression[w[1], :-, 'segment_base_fs'], di.address).to_a.include?(Expression[0])
sehptr = w[1]
@@ -225,7 +225,7 @@ EOS
puts "backtrace seh from #{di} => #{a.map { |addr| Expression[addr] }.join(', ')}" if $VERBOSE
a.each { |aa|
next if aa == Expression::Unknown
dasm.auto_label_at(aa, 'seh', 'loc', 'sub')
l = dasm.auto_label_at(aa, 'seh', 'loc', 'sub')
dasm.addrs_todo << [aa]
}
super(dasm, di)
@@ -243,19 +243,17 @@ EOS
old_cp = d.c_parser
d.c_parser = nil
d.parse_c '__stdcall void *GetProcAddress(int, char *);'
d.parse_c '__stdcall void ExitProcess(int) __attribute__((noreturn));'
d.c_parser.lexer.define_weak('__MS_X86_64_ABI__') if @cpu.shortname == 'x64'
d.c_parser.lexer.define_weak('__MS_X86_64_ABI__') if @cpu.kind_of? X86_64
gpa = @cpu.decode_c_function_prototype(d.c_parser, 'GetProcAddress')
epr = @cpu.decode_c_function_prototype(d.c_parser, 'ExitProcess')
d.c_parser = old_cp
d.parse_c ''
d.c_parser.lexer.define_weak('__MS_X86_64_ABI__') if @cpu.shortname == 'x64'
d.c_parser.lexer.define_weak('__MS_X86_64_ABI__') if @cpu.kind_of? X86_64
@getprocaddr_unknown = []
gpa.btbind_callback = lambda { |dasm, bind, funcaddr, calladdr, expr, origin, maxdepth|
break bind if @getprocaddr_unknown.include? [dasm, calladdr] or not Expression[expr].externals.include? :eax
sz = @cpu.size/8
break bind if not dasm.decoded[calladdr]
if @cpu.shortname == 'x64'
if @cpu.kind_of? X86_64
arg2 = :rdx
else
arg2 = Indirection[[:esp, :+, 2*sz], sz, calladdr]
@@ -270,7 +268,6 @@ EOS
bind
}
d.function[Expression['GetProcAddress']] = gpa
d.function[Expression['ExitProcess']] = epr
d.function[:default] = @cpu.disassembler_default_func
end
d
@@ -315,7 +312,7 @@ class LoadedPE < PE
# reads a loaded PE from memory, returns a PE object
# dumps the header, optheader and all sections ; try to rebuild IAT (#memdump_imports)
def self.memdump(memory, baseaddr, entrypoint=nil, iat_p=nil)
def self.memdump(memory, baseaddr, entrypoint = nil, iat_p=nil)
loaded = LoadedPE.load memory[baseaddr, 0x1000_0000]
loaded.load_address = baseaddr
loaded.decode
@@ -375,6 +372,7 @@ class LoadedPE < PE
else
# read imported pointer from the import structure
while not ptr = imports.first.iat.shift
load_dll = nil
imports.shift
break if imports.empty?
iat_p = imports.first.iat_p
@@ -417,7 +415,6 @@ class LoadedPE < PE
puts 'unknown ptr %x' % ptr if $DEBUG
# allow holes in the unk_iat_p table
break if not unk_iat_p or failcnt > 4
loaded_dll = nil
failcnt += 1
next
end
@@ -425,7 +422,7 @@ class LoadedPE < PE
end
# dumped last importdirectory is correct, append the import field
i = ImportDirectory::Import.new
i = ImportDirectory::Import.new
if e.name
puts e.name if $DEBUG
i.name = e.name
-164
@@ -1,164 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
require 'metasm/encode'
require 'metasm/decode'
module Metasm
# Python preparsed module (.pyc)
class PYC < ExeFormat
# 1 magic per python version...
# file = MAGIC(u16) \r \n timestamp(u32) data
MAGICS = [
62211 # 62211 = python2.7a0
]
class Header < SerialStruct
half :version
half :rn
word :timestamp
end
def decode_half(edata=@encoded) edata.decode_imm(:u16, @endianness) end
def decode_word(edata=@encoded) edata.decode_imm(:u32, @endianness) end
def decode_long(edata=@encoded) edata.decode_imm(:i32, @endianness) end
# file header
attr_accessor :header
# the marshalled object
attr_accessor :root
# list of all code objects
attr_accessor :all_code
def initialize()
@endianness = :little
@encoded = EncodedData.new
super()
end
def decode_header
@header = Header.decode(self)
end
def decode_pymarshal
case c = @encoded.read(1)
when '0' # NULL
:null
when 'N' # None
nil
when 'F' # False
false
when 'T' # True
true
#when 'S' # stopiter TODO
#when '.' # ellipsis TODO
when 'i' # long (i32)
decode_long
when 'I' # long (i64)
decode_word | (decode_long << 32)
when 'f' # float (ascii)
@encoded.read(@encoded.read(1).unpack('C').first).to_f
when 'g' # float (binary)
@encoded.read(8).unpack('d').first # XXX check
when 'x' # complex (f f)
{ :type => :complex,
:real => @encoded.read(@encoded.read(1).unpack('C').first).to_f,
:imag => @encoded.read(@encoded.read(1).unpack('C').first).to_f }
when 'y' # complex (g g)
{ :type => :complex,
:real => @encoded.read(8).unpack('d').first,
:imag => @encoded.read(8).unpack('d').first }
when 'l' # long (i32?)
decode_long
when 's' # string: len (long), data
@encoded.read(decode_long)
when 't' # 'interned': string with possible backreference later
s = @encoded.read(decode_long)
@references << s
s
when 'R' # stringref (see 't')
@references[decode_long]
when '(' # tuple (frozen Array): length l*objs
obj = []
decode_long.times { obj << decode_pymarshal }
obj
when '[' # list (Array)
obj = []
decode_long.times { obj << decode_pymarshal }
obj
when '{' # dict (Hash)
obj = {}
loop do
k = decode_pymarshal
break if k == :null
obj[k] = decode_pymarshal
end
{ :type => :hash, :hash => obj } # XXX to avoid confusion with code, etc
when 'c' # code
# XXX format varies with version (header.version)
obj = {}
obj[:type] = :code
obj[:argcount] = decode_long
#obj[:kwonly_argcount] = decode_long # not in py2.7
obj[:nlocals] = decode_long
obj[:stacksize] = decode_long
obj[:flags] = decode_long # TODO bit-decode this one
obj[:fileoff] = @encoded.ptr + 5 # XXX assume :code is a 's'
obj[:code] = decode_pymarshal
obj[:consts] = decode_pymarshal
obj[:names] = decode_pymarshal
obj[:varnames] = decode_pymarshal
obj[:freevars] = decode_pymarshal
obj[:cellvars] = decode_pymarshal
obj[:filename] = decode_pymarshal
obj[:name] = decode_pymarshal
obj[:firstlineno] = decode_long
obj[:lnotab] = decode_pymarshal
@all_code << obj
obj
when 'u' # unicode
@encoded.read(decode_long)
#when '?' # unknown TODO
#when '<' # set TODO
#when '>' # set (frozen) TODO
else
raise "unsupported python marshal #{c.inspect}"
end
end
def decode
decode_header
@all_code = []
@references = []
@root = decode_pymarshal
@references = nil
end
def cpu_from_headers
Python.new(self)
end
def each_section
yield @encoded, 0
end
def get_default_entrypoints
if @root.kind_of? Hash and @root[:type] == :code
[@root[:fileoff]]
else
[]
end
end
# return the :code part which contains off
def code_at_off(off)
@all_code.find { |c| c[:fileoff] <= off and c[:fileoff] + c[:code].length > off }
end
end
end
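The PYC Header struct in the removed file above is three little-endian fields: a u16 magic, the two bytes "\r\n", and a u32 timestamp. A minimal sketch of the same layout using only String#unpack follows; parse_pyc_header is a hypothetical helper and is not part of Metasm.

```ruby
# Hypothetical helper mirroring the Header layout shown above:
# u16 version (magic), u16 "\r\n" marker, u32 timestamp, all little-endian.
def parse_pyc_header(bytes)
  version, rn, timestamp = bytes.unpack('vvV')
  raise ArgumentError, 'truncated .pyc header' if timestamp.nil?
  # "\r\n" read as a little-endian u16 is 0x0a0d
  raise ArgumentError, 'missing CR LF marker' unless rn == 0x0a0d
  { version: version, timestamp: timestamp }
end

# Build a header with the Python 2.7 magic listed in MAGICS and parse it back.
raw = [62211, 0x0a0d, 1_300_000_000].pack('vvV')
hdr = parse_pyc_header(raw)
puts "magic=#{hdr[:version]} timestamp=#{hdr[:timestamp]}"
```

Checking the CR LF marker is a cheap sanity test before handing the rest of the file to a marshal decoder.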
+2 -10
@@ -46,13 +46,6 @@ class << self
# standard fields:
# virtual field, handled explicitly in a custom encode/decode
def virtual(*a)
a.each { |f|
new_field(f, nil, nil, nil)
}
end
# a fixed-size memory chunk
def mem(name, len, defval='')
new_field(name, lambda { |exe, me| exe.curencoded.read(len) }, lambda { |exe, me, val| val[0, len].ljust(len, 0.chr) }, defval)
@@ -66,7 +59,7 @@ class << self
# 0-terminated string
def strz(name, defval='')
d = lambda { |exe, me|
ed = exe.curencoded
ed = exe.curencoded
ed.read(ed.data.index(?\0, ed.ptr)-ed.ptr+1).chop
}
e = lambda { |exe, me, val| val + 0.chr }
@@ -114,7 +107,7 @@ class << self
d = lambda { |exe, me| (@bitfield_val >> off) & mask }
# update the temp var with the field value, return nil
e = lambda { |exe, me, val| @bitfield_val |= (val & mask) << off ; nil }
new_field(name, d, e, 0)
new_field(name, d, e, 0)
}
# free the temp var
@@ -130,7 +123,6 @@ class << self
# inject a hook to be run during the decoding process
def decode_hook(before=nil, &b)
@@fields[self] ||= []
idx = (before ? @@fields[self].index(fld_get(before)) : -1)
@@fields[self].insert(idx, [nil, b])
end
+3 -7
@@ -69,11 +69,11 @@ class Shellcode < ExeFormat
parse(*a) if not a.empty?
@encoded << assemble_sequence(@source, @cpu)
@source.clear
self
encode
end
def encode(binding={})
@encoded.fixup! binding if binding.kind_of? Hash
@encoded.fixup! binding
@encoded.fixup @encoded.binding(@base_addr)
@encoded.fill @encoded.rawsize
self
@@ -107,11 +107,7 @@ class Shellcode < ExeFormat
# returns a virtual subclass of Shellcode whose cpu_from_headers will return cpu
def self.withcpu(cpu)
c = Class.new(self)
c.send(:define_method, :cpu_from_headers) {
cpu = Metasm.const_get(cpu) if cpu.kind_of?(::String)
cpu = cpu.new if cpu.kind_of?(::Class) and cpu.ancestors.include?(CPU)
cpu
}
c.send(:define_method, :cpu_from_headers) { cpu }
c
end
end
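The two withcpu variants in the hunk above differ in when the cpu argument is resolved: one returns the captured value as-is, the other resolves a String or Class lazily on first use. A minimal sketch of the lazy form, using stand-in classes, follows; DemoCPU, DemoIa32 and DemoShellcode are hypothetical and only imitate the shape of the Metasm classes.

```ruby
class DemoCPU; end              # stand-in for Metasm::CPU (assumption)
class DemoIa32 < DemoCPU; end   # stand-in for a concrete CPU class

class DemoShellcode
  # Returns an anonymous subclass whose cpu_from_headers yields the given cpu.
  # The block closes over `cpu`, so the resolved instance is cached in the
  # closure after the first call.
  def self.withcpu(cpu)
    c = Class.new(self)
    c.send(:define_method, :cpu_from_headers) {
      cpu = Object.const_get(cpu) if cpu.kind_of?(::String)
      cpu = cpu.new if cpu.kind_of?(::Class) && cpu.ancestors.include?(DemoCPU)
      cpu
    }
    c
  end
end

klass = DemoShellcode.withcpu('DemoIa32')
puts klass.new.cpu_from_headers.class
```

Because the block reassigns the captured variable, every instance of the generated subclass shares one CPU object once it has been resolved.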
@@ -1,114 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
module Metasm
# Similar to Shellcode, with distinct sections per memory permission (R / RW / RX)
# encoding-side only
class Shellcode_RWX < ExeFormat
# the array of source elements (Instr/Data etc)
attr_accessor :source_r, :source_w, :source_x
# base address per section
attr_accessor :base_r, :base_w, :base_x
# encodeddata
attr_accessor :encoded_r, :encoded_w, :encoded_x
def initialize(cpu=nil)
@base_r = @base_w = @base_x = nil
@encoded_r = EncodedData.new
@encoded_w = EncodedData.new
@encoded_x = EncodedData.new
super(cpu)
end
def parse_init
@source_r = []
@source_w = []
@source_x = []
@cursource = @source_x
super()
end
# allows definition of the base address
def parse_parser_instruction(instr)
case instr.raw.downcase
when '.base', '.baseaddr', '.base_addr'
# ".base_addr <expression>"
# expression should #reduce to integer
@lexer.skip_space
raise instr, 'syntax error' if not base = Expression.parse(@lexer).reduce
raise instr, 'syntax error' if tok = @lexer.nexttok and tok.type != :eol
if @cursource.equal?(@source_r)
@base_r = base
elsif @cursource.equal?(@source_w)
@base_w = base
elsif @cursource.equal?(@source_x)
@base_x = base
else raise instr, "Where am I ?"
end
when '.rdata', '.rodata'
@cursource = @source_r
when '.data', '.bss'
@cursource = @source_w
when '.text'
@cursource = @source_x
else super(instr)
end
end
# encodes the source found in self.source_r, self.source_w and self.source_x
# appends each result to the matching self.encoded_r / encoded_w / encoded_x
# clears the three source arrays
# any arguments are forwarded to parse before assembling
def assemble(*a)
parse(*a) if not a.empty?
@encoded_r << assemble_sequence(@source_r, @cpu); @source_r.clear
@encoded_w << assemble_sequence(@source_w, @cpu); @source_w.clear
@encoded_x << assemble_sequence(@source_x, @cpu); @source_x.clear
self
end
def encode(binding={})
bd = {}
bd.update @encoded_r.binding(@base_r)
bd.update @encoded_w.binding(@base_w)
bd.update @encoded_x.binding(@base_x)
bd.update binding if binding.kind_of?(Hash)
@encoded_r.fixup bd
@encoded_w.fixup bd
@encoded_x.fixup bd
self
end
alias fixup encode
# resolve inter-section xrefs, raise if unresolved relocations remain
# call this when you have assembled+allocated memory for every section
def fixup_check(base_r=nil, base_w=nil, base_x=nil, bd={})
if base_r.kind_of?(Hash)
bd = base_r
base_r = nil
end
@base_r = base_r if base_r
@base_w = base_w if base_w
@base_x = base_x if base_x
fixup bd
ed = EncodedData.new << @encoded_r << @encoded_w << @encoded_x
raise ["Unresolved relocations:", ed.reloc.map { |o, r| "#{r.target} " + (Backtrace.backtrace_str(r.backtrace) if r.backtrace).to_s }].join("\n") if not ed.reloc.empty?
self
end
def encode_string(*a)
encode(*a)
ed = EncodedData.new << @encoded_r << @encoded_w << @encoded_x
ed.fixup(ed.binding)
raise ["Unresolved relocations:", ed.reloc.map { |o, r| "#{r.target} " + (Backtrace.backtrace_str(r.backtrace) if r.backtrace).to_s }].join("\n") if not ed.reloc.empty?
ed.data
end
end
end
-200
@@ -1,200 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
require 'metasm/encode'
require 'metasm/decode'
begin
require 'zlib'
rescue LoadError
end
module Metasm
class SWF < ExeFormat
attr_accessor :signature, :version, :header, :chunks
CHUNK_TYPE = {
0 => 'End', 1 => 'ShowFrame', 2 => 'DefineShape', 3 => 'FreeCharacter',
4 => 'PlaceObject', 5 => 'RemoveObject', 6 => 'DefineBits', 7 => 'DefineButton',
8 => 'JPEGTables', 9 => 'SetBackgroundColor', 10 => 'DefineFont', 11 => 'DefineText',
12 => 'DoAction', 13 => 'DefineFontInfo', 14 => 'DefineSound', 15 => 'StartSound',
16 => 'StopSound', 17 => 'DefineButtonSound', 18 => 'SoundStreamHead', 19 => 'SoundStreamBlock',
20 => 'DefineBitsLossless', 21 => 'DefineBitsJPEG2', 22 => 'DefineShape2', 23 => 'DefineButtonCxform',
24 => 'Protect', 25 => 'PathsArePostScript', 26 => 'PlaceObject2',
28 => 'RemoveObject2', 29 => 'SyncFrame', 31 => 'FreeAll',
32 => 'DefineShape3', 33 => 'DefineText2', 34 => 'DefineButton2', 35 => 'DefineBitsJPEG3',
36 => 'DefineBitsLossless2', 37 => 'DefineEditText', 38 => 'DefineVideo', 39 => 'DefineSprite',
40 => 'NameCharacter', 41 => 'ProductInfo', 42 => 'DefineTextFormat', 43 => 'FrameLabel',
44 => 'DefineBehavior', 45 => 'SoundStreamHead2', 46 => 'DefineMorphShape', 47 => 'FrameTag',
48 => 'DefineFont2', 49 => 'GenCommand', 50 => 'DefineCommandObj', 51 => 'CharacterSet',
52 => 'FontRef', 53 => 'DefineFunction', 54 => 'PlaceFunction', 55 => 'GenTagObject',
56 => 'ExportAssets', 57 => 'ImportAssets', 58 => 'EnableDebugger', 59 => 'DoInitAction',
60 => 'DefineVideoStream', 61 => 'VideoFrame', 62 => 'DefineFontInfo2', 63 => 'DebugID',
64 => 'EnableDebugger2', 65 => 'ScriptLimits', 66 => 'SetTabIndex', 67 => 'DefineShape4',
68 => 'DefineMorphShape2', 69 => 'FileAttributes', 70 => 'PlaceObject3', 71 => 'ImportAssets2',
72 => 'DoABC', 76 => 'SymbolClass', 82 => 'DoABC2',
}
class SerialStruct < Metasm::SerialStruct
new_int_field :u8, :u16, :u32, :f16, :f32
end
class Rectangle < SerialStruct
virtual :nbits, :xmin, :xmax, :ymin, :ymax
def decode(swf)
byte = swf.decode_u8
bleft = 3
@nbits = byte >> bleft
@xmin, @xmax, @ymin, @ymax = (0..3).map {
nb = @nbits
v = 0
while nb > bleft
nb -= bleft
v |= (byte & ((1<<bleft)-1)) << nb
bleft = 8
byte = swf.decode_u8
end
v |= (byte >> (bleft-nb)) & ((1<<nb)-1)
bleft -= nb
Expression.make_signed(v, @nbits)
}
end
def set_default_values(swf)
@xmin ||= 0
@xmax ||= 31
@ymin ||= 0
@ymax ||= 31
@nbits = (0..30).find { |nb|
[@xmin, @xmax, @ymin, @ymax].all? { |v|
if nb == 0
v == 0
elsif v >= 0
# reserve sign bit
(v >> (nb-1)) == 0
else
(v >> nb) == -1
end
} } || 31
end
def encode(swf)
ed = super(swf)
byte = @nbits << 3
bleft = 3
[@xmin, @xmax, @ymin, @ymax].each { |v|
nb = @nbits
while nb > bleft
byte |= (v >> (nb-bleft)) & ((1<<bleft)-1)
nb -= bleft
ed << byte
byte = 0
bleft = 8
end
byte |= (v & ((1<<nb)-1)) << (bleft-nb)
bleft -= nb
}
ed << byte if bleft < 8
ed
end
end
class Header < SerialStruct
virtual :view
u16 :framerate # XXX bigendian...
u16 :framecount
def bswap_framerate(swf)
@framerate = ((@framerate >> 8) & 0xff) | ((@framerate & 0xff) << 8) if swf.endianness == :little
end
def decode(swf)
@view = Rectangle.decode(swf)
super(swf)
bswap_framerate(swf)
end
def encode(swf)
ed = @view.encode(swf)
bswap_framerate(swf)
ed << super(swf)
bswap_framerate(swf)
ed
end
end
class Chunk < SerialStruct
bitfield :u16, 0 => :length_, 6 => :tag
fld_enum :tag, CHUNK_TYPE
attr_accessor :data
def decode(swf)
super(swf)
@length = (@length_ == 0x3f ? swf.decode_u32 : @length_)
@data = swf.encoded.read(@length)
end
def set_default_values(swf)
@length = @data.length
@length_ = [@length, 0x3f].min
end
def encode(swf)
super(swf) <<
(swf.encode_u32(@length) if @length >= 0x3f) <<
@data
end
end
def decode_u8( edata=@encoded) edata.decode_imm(:u8, @endianness) end
def decode_u16(edata=@encoded) edata.decode_imm(:u16, @endianness) end
def decode_u32(edata=@encoded) edata.decode_imm(:u32, @endianness) end
def decode_f16(edata=@encoded) edata.decode_imm(:i16, @endianness)/256.0 end
def decode_f32(edata=@encoded) edata.decode_imm(:i32, @endianness)/65536.0 end
def encode_u8(w) Expression[w].encode(:u8, @endianness) end
def encode_u16(w) Expression[w].encode(:u16, @endianness) end
def encode_u32(w) Expression[w].encode(:u32, @endianness) end
def encode_f16(w) Expression[(w*256).to_i].encode(:u16, @endianness) end
def encode_f32(w) Expression[(w*65536).to_i].encode(:u32, @endianness) end
attr_accessor :endianness
def initialize(cpu = nil)
@endianness = :little
@header = Header.new
@chunks = []
super(cpu)
end
def decode_header
@signature = @encoded.read(3)
@version = decode_u8
@data_length = decode_u32
case @signature
when 'FWS'
when 'CWS'
# data_length = uncompressed data length
data = @encoded.read(@encoded.length-8)
data = Zlib::Inflate.inflate(data)
@encoded = EncodedData.new(data)
else raise InvalidExeFormat, "Bad signature #{@signature.inspect}"
end
@data_length = [@data_length, @encoded.length].min
@header = Header.decode(self)
end
def decode
decode_header
while @encoded.ptr < @data_length
@chunks << Chunk.decode(self)
end
end
end
end
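Rectangle#decode in the removed SWF file above reads a bit-packed record: a 5-bit width nbits, followed by four signed values of nbits bits each, most significant bit first. A minimal sketch of the same decoding that trades speed for readability by going through a bit string follows; decode_swf_rect is a hypothetical helper and is not part of Metasm.

```ruby
# Hypothetical helper: decode an SWF RECT from a binary String.
# Assumes the input holds at least 5 + 4*nbits bits.
def decode_swf_rect(bytes)
  # Expand the input into a string of '0'/'1' characters, MSB first.
  bits = bytes.unpack1('B*')
  nbits = bits[0, 5].to_i(2)
  fields = (0...4).map do |i|
    raw = bits[5 + i * nbits, nbits].to_i(2)
    # Sign-extend: a value with its top bit set is negative.
    nbits > 0 && raw >= (1 << (nbits - 1)) ? raw - (1 << nbits) : raw
  end
  { nbits: nbits, xmin: fields[0], xmax: fields[1], ymin: fields[2], ymax: fields[3] }
end

# nbits=5, xmin=0, xmax=10, ymin=0, ymax=-3; pack('B*') zero-pads the last byte.
data = ['00101' + '00000' + '01010' + '00000' + '11101'].pack('B*')
rect = decode_swf_rect(data)
puts "xmax=#{rect[:xmax]} ymax=#{rect[:ymax]}"
```

The shift-and-mask loop in the original avoids building an intermediate string, which is the better choice when decoding many records.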
-333
@@ -1,333 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2006-2009 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
require 'metasm/encode'
require 'metasm/decode'
begin
require 'zlib'
rescue LoadError
end
# generic ZIP file, may be an APK or JAR
# supports only a trivial subset of the whole ZIP specification
# single file archive
# deflate or no compression
# no encryption
# 32bit offsets/sizes
module Metasm
class ZIP < ExeFormat
MAGIC_LOCALHEADER = 0x04034b50
COMPRESSION_METHOD = { 0 => 'NONE', 1 => 'SHRUNK', 2 => 'REDUCE1', 3 => 'REDUCE2',
4 => 'REDUCE3', 5 => 'REDUCE4', 6 => 'IMPLODE', 7 => 'TOKENIZED',
8 => 'DEFLATE', 9 => 'DEFLATE64', 10 => 'OLDTERSE', 12 => 'BZIP2', 14 => 'LZMA',
18 => 'TERSE', 19 => 'LZ77', 97 => 'WAVPACK', 98 => 'PPMD' }
# zip file format:
#
# [local header 1]
# compressed data 1
#
# [local header 2]
# compressed data 2
#
# [central header 1]
# [central header 2]
#
# [end of central directory]
class LocalHeader < SerialStruct
word :signature, MAGIC_LOCALHEADER
half :verneed, 10
half :flags # bit 3 => has data descriptor following the compressed data
half :compress_method, 0, COMPRESSION_METHOD
halfs :mtime, :mdate
word :crc32
words :compressed_sz, :uncompressed_sz
halfs :fname_len, :extra_len
attr_accessor :fname, :extra
attr_accessor :compressed_off
def decode(zip)
super(zip)
raise "Invalid ZIP signature #{@signature.to_s(16)}" if @signature != MAGIC_LOCALHEADER
@fname = zip.encoded.read(@fname_len) if @fname_len > 0
@extra = zip.encoded.read(@extra_len) if @extra_len > 0
@compressed_off = zip.encoded.ptr
end
def set_default_values(zip)
@fname_len = fname ? @fname.length : 0
@extra_len = extra ? @extra.length : 0
super(zip)
end
def encode(zip)
ed = super(zip)
ed << fname << extra
end
# return a new LocalHeader with all fields copied from a CentralHeader
def self.from_central(f)
l = new
l.verneed = f.verneed
l.flags = f.flags
l.compress_method = f.compress_method
l.mtime = f.mtime
l.mdate = f.mdate
l.crc32 = f.crc32
l.compressed_sz = f.compressed_sz
l.uncompressed_sz = f.uncompressed_sz
l.fname = f.fname
l.extra = f.extra
l
end
end
MAGIC_CENTRALHEADER = 0x02014b50
class CentralHeader < SerialStruct
word :signature, MAGIC_CENTRALHEADER
half :vermade, 10
half :verneed, 10
half :flags
half :compress_method, 0, COMPRESSION_METHOD
halfs :mtime, :mdate
word :crc32
words :compressed_sz, :uncompressed_sz
halfs :fname_len, :extra_len, :comment_len
half :disk_nr
half :file_attr_intern
word :file_attr_extern
word :localhdr_off
attr_accessor :fname, :extra, :comment
attr_accessor :data
def decode(zip)
super(zip)
raise "Invalid ZIP signature #{@signature.to_s(16)}" if @signature != MAGIC_CENTRALHEADER
@fname = zip.encoded.read(@fname_len) if @fname_len > 0
@extra = zip.encoded.read(@extra_len) if @extra_len > 0
@comment = zip.encoded.read(@comment_len) if @comment_len > 0
end
def set_default_values(zip)
@fname_len = fname ? @fname.length : 0
@extra_len = extra ? @extra.length : 0
@comment_len = comment ? @comment.length : 0
super(zip)
end
def encode(zip)
ed = super(zip)
ed << fname << extra << comment
end
# reads the raw file data from the archive
def file_data(zip)
return @data if data
zip.encoded.ptr = @localhdr_off
LocalHeader.decode(zip)
raw = zip.encoded.read(@compressed_sz)
@data = case @compress_method
when 'NONE'
raw
when 'DEFLATE'
z = Zlib::Inflate.new(-Zlib::MAX_WBITS)
z.inflate(raw)
else
raise "Unsupported zip compress method #@compress_method"
end
end
def zlib_deflate(data, level=Zlib::DEFAULT_COMPRESSION)
z = Zlib::Deflate.new(level, -Zlib::MAX_WBITS)
z.deflate(data) + z.finish
end
# encode the data, fixup related fields
def encode_data(zip)
data = file_data(zip)
@compress_method = 'NONE' if data == ''
@crc32 = Zlib.crc32(data)
@uncompressed_sz = data.length
case compress_method
when 'NONE'
when 'DEFLATE'
data = zlib_deflate(data)
when nil
# autodetect compression method
# compress if it saves more than 10% space
cdata = zlib_deflate(data)
ratio = cdata.length * 100 / data.length
if ratio < 90
@compress_method = 'DEFLATE'
data = cdata
else
@compress_method = 'NONE'
end
end
@compressed_sz = data.length
data
end
end
MAGIC_ENDCENTRALDIRECTORY = 0x06054b50
class EndCentralDirectory < SerialStruct
word :signature, MAGIC_ENDCENTRALDIRECTORY
halfs :disk_nr, :disk_centraldir, :entries_nr_thisdisk, :entries_nr
word :directory_sz
word :directory_off
half :comment_len
attr_accessor :comment
def decode(zip)
super(zip)
raise "Invalid ZIP end signature #{@signature.to_s(16)}" if @signature != MAGIC_ENDCENTRALDIRECTORY
@comment = zip.encoded.read(@comment_len) if @comment_len > 0
end
def set_default_values(zip)
@entries_nr_thisdisk = zip.files.length
@entries_nr = zip.files.length
@comment_len = comment ? @comment.length : 0
super(zip)
end
def encode(zip)
ed = super(zip)
ed << comment
end
end
def decode_half(edata=@encoded) edata.decode_imm(:u16, @endianness) end
def decode_word(edata=@encoded) edata.decode_imm(:u32, @endianness) end
def encode_half(w) Expression[w].encode(:u16, @endianness) end
def encode_word(w) Expression[w].encode(:u32, @endianness) end
attr_accessor :files, :header
def initialize(cpu = nil)
@endianness = :little
@header = EndCentralDirectory.new
@files = []
super(cpu)
end
# scan and decode the 'end of central directory' header
def decode_header
if not @encoded.ptr = @encoded.data.rindex([MAGIC_ENDCENTRALDIRECTORY].pack('V'))
raise "ZIP: no end of central directory record"
end
@header = EndCentralDirectory.decode(self)
end
# read the whole central directory file descriptors
def decode
decode_header
@encoded.ptr = @header.directory_off
while @encoded.ptr < @header.directory_off + @header.directory_sz
@files << CentralHeader.decode(self)
end
end
# checks if a given file name exists in the archive
# returns the CentralHeader or nil
# case-insensitive if lcase is false
def has_file(fname, lcase=true)
decode if @files.empty?
if lcase
@files.find { |f| f.fname == fname }
else
fname = fname.downcase
@files.find { |f| f.fname.downcase == fname }
end
end
# returns the uncompressed raw file content from a given name
# nil if name not found
# case-insensitive if lcase is false
def file_data(fname, lcase=true)
if f = has_file(fname, lcase)
f.file_data(self)
end
end
# add a new file to the zip archive
def add_file(fname, data, compress=:auto)
f = CentralHeader.new
case compress
when 'NONE', false; f.compress_method = 'NONE'
when 'DEFLATE', true; f.compress_method = 'DEFLATE'
end
f.fname = fname
f.data = data
@files << f
f
end
# create a new zip file
def encode
edata = EncodedData.new
central_dir = EncodedData.new
@files.each { |f|
encode_entry(f, edata, central_dir)
}
@header.directory_off = edata.length
@header.directory_sz = central_dir.length
edata << central_dir << @header.encode(self)
@encoded = edata
end
# add one file to the zip stream
def encode_entry(f, edata, central_dir)
f.localhdr_off = edata.length
# may autodetect compression method
raw = f.encode_data(self)
zipalign(f, edata)
central_dir << f.encode(self) # calls f.set_default_values
l = LocalHeader.from_central(f)
edata << l.encode(self)
edata << raw
end
# zipalign: ensure uncompressed data starts on a 4-aligned offset
def zipalign(f, edata)
if f.compress_method == 'NONE' and not f.extra
o = (edata.length + f.fname.length + 2) & 3
f.extra = " "*(4-o) if o > 0
end
end
# when called as AutoExe, try to find a meaningful exefmt
def self.autoexe_load(bin)
z = decode(bin)
if dex = z.file_data('classes.dex')
puts "ZIP: APK file, loading 'classes.dex'" if $VERBOSE
AutoExe.load(dex)
else
z
end
end
end
end
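Note on the `+ 2` in `zipalign` above: assuming the standard ZIP local file header layout (30 fixed bytes, then the file name, then the extra field), 30 mod 4 is 2, so `edata.length + f.fname.length + 2` has the same residue modulo 4 as the offset where the raw data would start. A minimal standalone sketch of that arithmetic follows; `zipalign_padding` and `LOCAL_HEADER_FIXED` are hypothetical names used only for illustration and are not part of Metasm.

```ruby
# Hypothetical helper, not Metasm API: how many bytes of 'extra' padding
# are needed so that stored (uncompressed) data starts on a 4-byte boundary.
LOCAL_HEADER_FIXED = 30 # fixed-size part of a ZIP local file header

def zipalign_padding(stream_len, fname_len)
  # offset of the file data if no extra field were present
  o = (stream_len + LOCAL_HEADER_FIXED + fname_len) & 3
  o > 0 ? 4 - o : 0
end

# Same residue as the (len + fname + 2) & 3 form used in the diff,
# because 30 - 2 = 28 is a multiple of 4.
pad = zipalign_padding(0, 'classes.dex'.length)
puts "padding for classes.dex at offset 0: #{pad}"
```

The sketch returns a byte count instead of setting `f.extra`, which keeps it independent of the `CentralHeader` class.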
+17 -7
View File
@@ -1,13 +1,23 @@
backend = ENV['METASM_GUI'] || (
backend = case ENV['METASM_GUI']
when 'gtk'; 'gtk'
when 'qt'; 'qt'
when 'win32'; 'win32'
else
puts "Unsupported METASM_GUI #{ENV['METASM_GUI'].inspect}" if $VERBOSE and ENV['METASM_GUI']
if RUBY_PLATFORM =~ /(i.86|x(86_)?64)-(mswin|mingw|cygwin)/i
'win32'
else
begin
require 'gtk2'
'gtk'
rescue LoadError
begin
require 'gtk2'
'gtk'
rescue LoadError
#begin
# require 'Qt4'
# 'qt'
#rescue LoadError
raise LoadError, 'No GUI ruby binding installed - please install libgtk2-ruby'
end
#end
end
)
end
end
require "metasm/gui/#{backend}"
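The `case` variant of the backend selection in this hunk can be summarised as a pure function. This is only a sketch under simplifying assumptions: `pick_gui_backend` is a hypothetical name, and the `require 'gtk2'` probe with its `LoadError` fallback is deliberately left out, so the non-Windows default is simply reported as `'gtk'`.

```ruby
# Hypothetical helper, not part of Metasm: choose a GUI backend name from
# an explicit setting (the value of METASM_GUI) and a platform string.
def pick_gui_backend(env_value, platform)
  case env_value
  when 'gtk', 'qt', 'win32'
    env_value # explicit, supported choice wins
  else
    # same platform test as in the diff
    if platform =~ /(i.86|x(86_)?64)-(mswin|mingw|cygwin)/i
      'win32'
    else
      'gtk' # the real code first checks that the gtk2 binding loads
    end
  end
end

puts pick_gui_backend(ENV['METASM_GUI'], RUBY_PLATFORM)
```

Passing the environment value and platform as arguments keeps the logic testable without touching `ENV` or loading any GUI binding.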
+41 -35
View File
@@ -23,7 +23,8 @@ class CStructWidget < DrawableWidget
@cwidth = @cheight = 1 # widget size in chars
@structdepth = 2
@default_color_association = ColorTheme.merge :keyword => :blue
@default_color_association = { :text => :black, :keyword => :blue, :caret => :black,
:background => :white, :hl_word => :palered, :comment => :darkblue }
end
def click(x, y)
@@ -89,7 +90,19 @@ class CStructWidget < DrawableWidget
elsif cx < @view_x
else
t = t[(@view_x - cx + t.length)..-1] if cx-t.length < @view_x
draw_string_hl(c, x, y, t)
if @hl_word
stmp = t
pre_x = 0
while stmp =~ /^(.*?)(\b#{Regexp.escape @hl_word}\b)/
s1, s2 = $1, $2
pre_x += s1.length*@font_width
hl_w = s2.length*@font_width
draw_rectangle_color(:hl_word, x+pre_x, y, hl_w, @font_height)
pre_x += hl_w
stmp = stmp[s1.length+s2.length..-1]
end
end
draw_string_color(c, x, y, t)
x += t.length * @font_width
end
}
@@ -103,7 +116,7 @@ class CStructWidget < DrawableWidget
cy = (@caret_y-@view_y)*@font_height
draw_line_color(:caret, cx, cy, cx, cy+@font_height-1)
end
@oldcaret_x, @oldcaret_y = @caret_x, @caret_y
end
@@ -166,31 +179,28 @@ class CStructWidget < DrawableWidget
when ?l
liststructs
when ?t
inputbox('new struct name to use', :text => (@curstruct.name rescue '')) { |n| focus_struct_byname(n) }
inputbox('new struct name to use', :text => (@curstruct.name rescue '')) { |n|
lst = @dasm.c_parser.toplevel.struct.keys.grep(String)
if fn = lst.find { |ln| ln == n } || lst.find { |ln| ln.downcase == n.downcase }
focus_addr(@curaddr, @dasm.c_parser.toplevel.struct[fn])
else
lst = @dasm.c_parser.toplevel.symbol.keys.grep(String).find_all { |ln|
s = @dasm.c_parser.toplevel.symbol[ln]
s.kind_of?(C::TypeDef) and s.untypedef.kind_of?(C::Union)
}
if fn = lst.find { |ln| ln == n } || lst.find { |ln| ln.downcase == n.downcase }
focus_addr(@curaddr, @dasm.c_parser.toplevel.symbol[fn].untypedef)
else
liststructs(n)
end
end
}
else return false
end
true
end
# display the struct or pop a list of matching struct names if ambiguous
def focus_struct_byname(n, addr=@curaddr)
lst = @dasm.c_parser.toplevel.struct.keys.grep(String)
if fn = lst.find { |ln| ln == n } || lst.find { |ln| ln.downcase == n.downcase }
focus_addr(addr, @dasm.c_parser.toplevel.struct[fn])
else
lst = @dasm.c_parser.toplevel.symbol.keys.grep(String).find_all { |ln|
s = @dasm.c_parser.toplevel.symbol[ln]
s.kind_of?(C::TypeDef) and s.untypedef.kind_of?(C::Union)
}
if fn = lst.find { |ln| ln == n } || lst.find { |ln| ln.downcase == n.downcase }
focus_addr(addr, @dasm.c_parser.toplevel.symbol[fn].untypedef)
else
liststructs(n, addr)
end
end
end
def liststructs(partname=nil, addr=@curaddr)
def liststructs(partname=nil)
tl = @dasm.c_parser.toplevel
list = [['name', 'size']]
list += tl.struct.keys.grep(String).sort.map { |stn|
@@ -206,12 +216,12 @@ class CStructWidget < DrawableWidget
}.compact
if partname and list.length == 2
focus_addr(addr, tl.struct[list[1][0]] || tl.symbol[list[1][0]].untypedef)
focus_addr(@curaddr, tl.struct[list[1][0]] || tl.symbol[list[1][0]].untypedef)
return
end
listwindow('structs', list) { |stn|
focus_addr(addr, tl.struct[stn[0]] || tl.symbol[stn[0]].untypedef)
focus_addr(@curaddr, tl.struct[stn[0]] || tl.symbol[stn[0]].untypedef)
}
end
@@ -230,7 +240,7 @@ class CStructWidget < DrawableWidget
def update_caret
if @caret_x < @view_x or @caret_x >= @view_x + @cwidth or @caret_y < @view_y or @caret_y >= @view_y + @cheight
redraw
elsif update_hl_word(@line_text[@caret_y], @caret_x, :c)
elsif update_hl_word(@line_text[@caret_y], @caret_x)
redraw
else
invalidate_caret(@oldcaret_x-@view_x, @oldcaret_y-@view_y)
@@ -244,14 +254,9 @@ class CStructWidget < DrawableWidget
def focus_addr(addr, struct=@curstruct)
return if @parent_widget and not addr = @parent_widget.normalize(addr)
@curaddr = addr
@curstruct = struct
@caret_x = @caret_y = 0
if struct.kind_of? String
@curstruct = nil
focus_struct_byname(struct)
else
@curstruct = struct
gui_update
end
gui_update
true
end
@@ -273,7 +278,7 @@ class CStructWidget < DrawableWidget
@line_text_col << []
render[indent * [@structdepth - maxdepth, 0].max, :text]
}
if not obj
@line_text_col = [[]]
@line_dereference = []
@@ -303,6 +308,7 @@ class CStructWidget < DrawableWidget
elsif struct.kind_of?(C::Struct)
render["struct #{struct.name || '_'} st_#{Expression[@curaddr]} = ", :text] if not off
fldoff = struct.fldoffset
fbo = struct.fldbitoffset || {}
else
render["union #{struct.name || '_'} un_#{Expression[@curaddr]} = ", :text] if not off
end
@@ -357,7 +363,7 @@ class CStructWidget < DrawableWidget
else
@line_text_col = [[[:text, '/* no struct selected (list with "l") */']]]
end
@line_text = @line_text_col.map { |l| l.map { |c, s| s }.join }
update_caret
redraw
+11 -11
View File
@@ -19,15 +19,15 @@ class CoverageWidget < DrawableWidget
@section_x = []
@slave = nil # another dasmwidget whose curaddr is kept sync
@default_color_association = ColorTheme.merge :caret => :yellow, :caret_col => :darkyellow,
:background => :palegrey, :code => :red, :data => :blue
@default_color_association = { :caret => :yellow, :caret_col => :darkyellow,
:background => :palegrey, :code => :red, :data => :blue }
end
def click(x, y)
x, y = x.to_i - 1, y.to_i
@sections.zip(@section_x).each { |s, sx|
if x >= sx[0] and x < sx[1]+@pixel_w
@curaddr = s[0] + (x-sx[0])/@pixel_w*@byte_per_col + (y/@pixel_h-@spacing)*@byte_per_col/@col_height
@sections.zip(@section_x).each { |(a, l, seq), (sx, sxe)|
if x >= sx and x < sxe+@pixel_w
@curaddr = a + (x-sx)/@pixel_w*@byte_per_col + (y/@pixel_h-@spacing)*@byte_per_col/@col_height
@slave.focus_addr(@curaddr) if @slave rescue @slave=nil
redraw
break
@@ -125,13 +125,13 @@ class CoverageWidget < DrawableWidget
x += @spacing*@pixel_w
}
@sections.zip(@section_x).each { |s, sx|
next if @curaddr.kind_of? Integer and not s[0].kind_of? Integer
next if @curaddr.kind_of? Expression and not s[0].kind_of? Expression
co = @curaddr-s[0]
if co >= 0 and co < s[1]
@sections.zip(@section_x).each { |(a, l, seq), (sx, sxe)|
next if @curaddr.kind_of? Integer and not a.kind_of? Integer
next if @curaddr.kind_of? Expression and not a.kind_of? Expression
co = @curaddr-a
if co >= 0 and co < l
draw_color :caret_col
x = sx[0] + (co/@byte_per_col)*@pixel_w
x = sx + (co/@byte_per_col)*@pixel_w
draw_rect[-@spacing, -1, 1]
draw_rect[@col_height, @col_height+@spacing, 1]
draw_color :caret
+20 -7
View File
@@ -19,8 +19,9 @@ class CdecompListingWidget < DrawableWidget
@curaddr = nil
@tabwidth = 8
@default_color_association = ColorTheme.merge :keyword => :blue, :localvar => :darkred,
:globalvar => :darkgreen, :intrinsic => :darkyellow
@default_color_association = { :text => :black, :keyword => :blue, :caret => :black,
:background => :white, :hl_word => :palered, :localvar => :darkred,
:globalvar => :darkgreen, :intrinsic => :darkyellow }
end
def curfunc
@@ -90,7 +91,19 @@ class CdecompListingWidget < DrawableWidget
# must not include newline
render = lambda { |str, color|
# function ends when we write under the bottom of the listing
draw_string_hl(color, x, y, str)
if @hl_word
stmp = str
pre_x = 0
while stmp =~ /^(.*?)(\b#{Regexp.escape @hl_word}\b)/
s1, s2 = $1, $2
pre_x += s1.length*@font_width
hl_w = s2.length*@font_width
draw_rectangle_color(:hl_word, x+pre_x, y, hl_w, @font_height)
pre_x += hl_w
stmp = stmp[s1.length+s2.length..-1]
end
end
draw_string_color(color, x, y, str)
x += str.length * @font_width
}
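The inline highlighting loop shown in this hunk (and in the matching `CStructWidget` hunk) walks the string with a non-greedy prefix match and accumulates pixel offsets. The same scan can be sketched in character units, without any drawing calls; `hl_spans` is a hypothetical name, and the sketch assumes a non-empty highlight word and a single-line string.

```ruby
# Hypothetical helper, not Metasm API: return [start, length] pairs (in
# characters) for every whole-word occurrence of 'word' in 'text'.
def hl_spans(text, word)
  spans = []
  pos = 0     # characters consumed so far
  rest = text
  while rest =~ /^(.*?)(\b#{Regexp.escape word}\b)/
    s1, s2 = $1, $2
    pos += s1.length           # skip the text before the match
    spans << [pos, s2.length]  # the widget draws a rectangle here
    pos += s2.length
    rest = rest[(s1.length + s2.length)..-1]
  end
  spans
end

p hl_spans('eax = eax + 1', 'eax')
```

Multiplying each start and length by the font width gives the `pre_x` and `hl_w` values that the widget code passes to `draw_rectangle_color`.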
@@ -115,7 +128,7 @@ class CdecompListingWidget < DrawableWidget
cy = (@caret_y-@view_y)*@font_height
draw_line_color(:caret, cx, cy, cx, cy+@font_height-1)
end
@oldcaret_x, @oldcaret_y = @caret_x, @caret_y
end
@@ -171,7 +184,7 @@ class CdecompListingWidget < DrawableWidget
f.decompdata[:stackoff_name][s.stackoff] = v if s.stackoff
elsif @dasm.c_parser.toplevel.symbol[n]
@dasm.rename_label(n, v)
@curaddr = v if @curaddr == n
@curaddr = v if @curaddr == n
end
gui_update
}
@@ -251,13 +264,13 @@ class CdecompListingWidget < DrawableWidget
invalidate_caret(@caret_x-@view_x, @caret_y-@view_y)
@oldcaret_x, @oldcaret_y = @caret_x, @caret_y
redraw if update_hl_word(@line_text[@caret_y], @caret_x, :c)
redraw if update_hl_word(@line_text[@caret_y], @caret_x)
end
# focus on addr
# returns true on success (address exists & decompiled)
def focus_addr(addr)
if @dasm.c_parser and (@dasm.c_parser.toplevel.symbol[addr] or @dasm.c_parser.toplevel.struct[addr].kind_of?(C::Union))
if @dasm.c_parser and (@dasm.c_parser.toplevel.symbol[addr] or @dasm.c_parser.toplevel.struct[addr])
@curaddr = addr
@caret_x = @caret_y = 0
gui_update
+534 -875
View File
@@ -22,7 +22,13 @@ class Graph
#def inspect ; puts caller ; "#{Expression[@id] rescue @id.inspect}" end
end
attr_accessor :id, :box, :box_id, :root_addrs, :view_x, :view_y, :keep_split
# TODO
class MergedBox
attr_accessor :id, :text, :x, :y, :w, :h
attr_accessor :to, :from
end
attr_accessor :id, :box, :root_addrs, :view_x, :view_y, :keep_split
def initialize(id)
@id = id
@root_addrs = []
@@ -33,597 +39,29 @@ class Graph
# empty @box
def clear
@box = []
@box_id = {}
end
# link the two boxes (by id)
def link_boxes(id1, id2)
raise "unknown index 1 #{id1}" if not b1 = @box_id[id1]
raise "unknown index 2 #{id2}" if not b2 = @box_id[id2]
raise "unknown index 1 #{id1}" if not b1 = @box.find { |b| b.id == id1 }
raise "unknown index 2 #{id2}" if not b2 = @box.find { |b| b.id == id2 }
b1.to |= [b2]
b2.from |= [b1]
end
# creates a new box, ensures id is not already taken
def new_box(id, content=nil)
raise "duplicate id #{id}" if @box_id[id]
raise "duplicate id #{id}" if @box.find { |b| b.id == id }
b = Box.new(id, content)
@box << b
@box_id[id] = b
b
end
# returns the [x1, y1, x2, y2] of the rectangle encompassing all boxes
def boundingbox
minx = @box.map { |b| b.x }.min.to_i
miny = @box.map { |b| b.y }.min.to_i
maxx = @box.map { |b| b.x + b.w }.max.to_i
maxy = @box.map { |b| b.y + b.h }.max.to_i
[minx, miny, maxx, maxy]
end
# a -> b -> c -> d (no other in/outs)
def pattern_layout_col(groups)
# find head
return if not head = groups.find { |g|
g.to.length == 1 and
g.to[0].from.length == 1 and
(g.from.length != 1 or g.from[0].to.length != 1)
}
# find full sequence
ar = [head]
while head.to.length == 1 and head.to[0].from.length == 1
head = head.to[0]
ar << head
end
# move boxes inside this group
maxw = ar.map { |g| g.w }.max
fullh = ar.inject(0) { |h, g| h + g.h }
cury = -fullh/2
ar.each { |g|
dy = cury - g.y
g.content.each { |b| b.y += dy }
cury += g.h
}
# create replacement group
newg = Box.new(nil, ar.map { |g| g.content }.flatten)
newg.w = maxw
newg.h = fullh
newg.x = -newg.w/2
newg.y = -newg.h/2
newg.from = ar.first.from - ar
newg.to = ar.last.to - ar
# fix xrefs
newg.from.each { |g| g.to -= ar ; g.to << newg }
newg.to.each { |g| g.from -= ar ; g.from << newg }
# fix groups
groups[groups.index(head)] = newg
ar.each { |g| groups.delete g }
true
end
# if a group has no content close to its x/x+w borders, shrink it
def group_remove_hz_margin(g, maxw=16)
if g.content.empty?
g.x = -maxw/2 if g.x < -maxw/2
g.w = maxw if g.w > maxw
return
end
margin_left = g.content.map { |b| b.x }.min - g.x
margin_right = g.x+g.w - g.content.map { |b| b.x+b.w }.max
if margin_left + margin_right > maxw
g.w -= margin_left + margin_right - maxw
dx = (maxw/2 + margin_right - margin_left)/2
g.content.each { |b| b.x += dx }
g.x = -g.w/2
end
end
# a -> [b, c, d] -> e
def pattern_layout_line(groups)
# find head
ar = []
groups.each { |g|
if g.from.length == 1 and g.to.length <= 1 and g.from.first.to.length > 1
ar = g.from.first.to.find_all { |gg| gg.from == g.from and gg.to == g.to }
elsif g.from.empty? and g.to.length == 1 and g.to.first.from.length > 1
ar = g.to.first.from.find_all { |gg| gg.from == g.from and gg.to == g.to }
else ar = []
end
break if ar.length > 1
}
return if ar.length <= 1
ar.each { |g| group_remove_hz_margin(g) }
# move boxes inside this group
#ar = ar.sort_by { |g| -g.h }
maxh = ar.map { |g| g.h }.max
fullw = ar.inject(0) { |w, g| w + g.w }
curx = -fullw/2
ar.each { |g|
# if no to, put all boxes at bottom ; if no from, put them at top
case [g.from.length, g.to.length]
when [1, 0]; dy = (g.h - maxh)/2
when [0, 1]; dy = (maxh - g.h)/2
else dy = 0
end
dx = curx - g.x
g.content.each { |b| b.x += dx ; b.y += dy }
curx += g.w
}
# add a 'margin-top' proportional to the ar width
# this gap should be relative to the real boxes and not possible previous gaps when
# merging lines (eg long line + many if patterns -> dont duplicate gaps)
boxen = ar.map { |g| g.content }.flatten
realh = boxen.map { |g| g.y + g.h }.max - boxen.map { |g| g.y }.min
if maxh < realh + fullw/4
maxh = realh + fullw/4
end
# create replacement group
newg = Box.new(nil, ar.map { |g| g.content }.flatten)
newg.w = fullw
newg.h = maxh
newg.x = -newg.w/2
newg.y = -newg.h/2
newg.from = ar.first.from
newg.to = ar.first.to
# fix xrefs
newg.from.each { |g| g.to -= ar ; g.to << newg }
newg.to.each { |g| g.from -= ar ; g.from << newg }
# fix groups
groups[groups.index(ar.first)] = newg
ar.each { |g| groups.delete g }
true
end
# a -> b -> c & a -> c
def pattern_layout_ifend(groups)
# find head
return if not head = groups.find { |g|
g.to.length == 2 and
((g.to[0].from.length == 1 and g.to[0].to.length == 1 and g.to[0].to[0] == g.to[1]) or
(g.to[1].from.length == 1 and g.to[1].to.length == 1 and g.to[1].to[0] == g.to[0]))
}
if head.to[0].to.include?(head.to[1])
ten = head.to[0]
else
ten = head.to[1]
end
# stuff 'then' inside the 'if'
# move 'if' up, 'then' down
head.content.each { |g| g.y -= ten.h/2 }
ten.content.each { |g| g.y += head.h/2 }
head.h += ten.h
head.y -= ten.h/2
# widen 'if'
# this adds a phantom left side
# drop existing margins first
group_remove_hz_margin(ten)
dw = ten.w - head.w/2
if dw > 0
# need to widen head to fit ten
head.w += 2*dw
head.x -= dw
end
# merge
ten.content.each { |g| g.x += -ten.x }
head.content.concat ten.content
head.to.delete ten
head.to[0].from.delete ten
groups.delete ten
true
end
def pattern_layout_complex(groups)
order = order_graph(groups)
uniq = nil
if groups.sort_by { |g| order[g] }.find { |g|
next if g.to.length <= 1
# list all nodes reachable for every 'to'
reach = g.to.map { |t| list_reachable(t) }
# list all nodes reachable only from a single 'to'
uniq = []
reach.each_with_index { |r, i|
# take all nodes reachable from there ...
u = uniq[i] = r.dup
u.delete_if { |k, v| k.content.empty? } # ignore previous layout_complex artifacts
reach.each_with_index { |rr, ii|
next if i == ii
# ... and delete nodes reachable from anywhere else
rr.each_key { |k| u.delete k }
}
}
uniq.delete_if { |u| u.length <= 1 }
!uniq.empty?
}
# now layout every uniq subgroup independently
uniq.each { |u|
subgroups = groups.find_all { |g| u[g] }
# isolate subgroup from external links
# change all external links into a single empty box
newtop = Box.new(nil, [])
newtop.x = -8 ; newtop.y = -9
newtop.w = 16 ; newtop.h = 18
newbot = Box.new(nil, [])
newbot.x = -8 ; newbot.y = -9
newbot.w = 16 ; newbot.h = 18
hadfrom = [] ; hadto = []
subgroups.each { |g|
g.to.dup.each { |t|
next if u[t]
newbot.from |= [g]
g.to.delete t
hadto << t
g.to |= [newbot]
}
g.from.dup.each { |f|
next if u[f]
newtop.to |= [g]
g.from.delete f
hadfrom << f
g.from |= [newtop]
}
}
subgroups << newtop << newbot
# subgroup layout
auto_arrange_step(subgroups) while subgroups.length > 1
newg = subgroups[0]
# patch 'groups'
idx = groups.index { |g| u[g] }
groups.delete_if { |g| u[g] }
groups[idx, 0] = [newg]
# restore external links & fix xrefs
hadfrom.uniq.each { |f|
f.to.delete_if { |t| u[t] }
f.to |= [newg]
newg.from |= [f]
}
hadto.uniq.each { |t|
t.from.delete_if { |f| u[f] }
t.from |= [newg]
newg.to |= [t]
}
}
true
end
end
# find the minimal set of nodes from which we can reach all others
# this is done *before* removing cycles in the graph
# returns the order (Hash group => group_order)
# roots have an order of 0
def order_graph(groups)
roots = groups.find_all { |g| g.from.empty? }
o = {} # tentative order
todo = []
loop do
roots.each { |g|
o[g] ||= 0
todo |= g.to.find_all { |gg| not o[gg] }
}
# order nodes from the tentative roots
until todo.empty?
n = todo.find { |g| g.from.all? { |gg| o[gg] } } || order_solve_cycle(todo, o)
todo.delete n
o[n] = n.from.map { |g| o[g] }.compact.max + 1
todo |= n.to.find_all { |g| not o[g] }
end
break if o.length >= groups.length
# pathological cases
if noroot = groups.find_all { |g| o[g] and g.from.find { |gg| not o[gg] } }.sort_by { |g| o[g] }.first
# we picked a root in the middle of the graph, walk up
todo |= noroot.from.find_all { |g| not o[g] }
until todo.empty?
n = todo.find { |g| g.to.all? { |gg| o[gg] } } ||
todo.sort_by { |g| g.to.map { |gg| o[gg] }.compact.min }.first
todo.delete n
o[n] = n.to.map { |g| o[g] }.compact.min - 1
todo |= n.from.find_all { |g| not o[g] }
end
# setup todo for next fwd iteration
todo |= groups.find_all { |g| not o[g] and g.from.find { |gg| o[gg] } }
else
# disjoint graph, start over from one other random node
roots << groups.find { |g| not o[g] }
end
end
if o.values.find { |rank| rank < 0 }
# did hit a pathological case, restart with found real roots
roots = groups.find_all { |g| not g.from.find { |gg| o[gg] < o[g] } }
o = {}
todo = []
roots.each { |g|
o[g] ||= 0
todo |= g.to.find_all { |gg| not o[gg] }
}
until todo.empty?
n = todo.find { |g| g.from.all? { |gg| o[gg] } } || order_solve_cycle(todo, o)
todo.delete n
o[n] = n.from.map { |g| o[g] }.compact.max + 1
todo |= n.to.find_all { |g| not o[g] }
end
# there's something screwy around here !
raise "moo" if o.length < groups.length
end
o
end
def order_solve_cycle(todo, o)
# 'todo' has no trivial candidate
# pick one node from todo which no other todo can reach
# exclude pathing through already ordered nodes
todo.find { |t1|
not todo.find { |t2| t1 != t2 and can_find_path(t2, t1, o.dup) }
} ||
# some cycle heads are mutually recursive
todo.sort_by { |t1|
# find the one who can reach the most others
[todo.find_all { |t2| t1 != t2 and can_find_path(t1, t2, o.dup) }.length,
# and with the highest rank
t1.from.map { |gg| o[gg] }.compact.max]
}.last
end
# checks if there is a path from src to dst avoiding stuff in 'done'
def can_find_path(src, dst, done={})
todo = [src]
while g = todo.pop
next if done[g]
return true if g == dst
done[g] = true
todo.concat g.to
end
false
end
# returns a hash with true for every node reachable from src (included)
def list_reachable(src, done={})
todo = [src]
while g = todo.pop
next if done[g]
done[g] = true
todo.concat g.to
end
done
end
# revert looping edges in groups
def make_tree(groups, order)
# now we have the roots and node orders
# revert cycling edges - o(chld) < o(parent)
order.each_key { |g|
g.to.dup.each { |gg|
if order[gg] < order[g]
# cycling edge, revert
g.to.delete gg
gg.from.delete g
g.from |= [gg]
gg.to |= [g]
end
}
}
end
# group groups in layers of same order
# create dummy groups along long edges so that no path exists between non-contiguous layers
def create_layers(groups, order)
newemptybox = lambda {
b = Box.new(nil, [])
b.x = -8
b.y = -9
b.w = 16
b.h = 18
groups << b
b
}
newboxo = {}
order.each_key { |g|
og = order[g] || newboxo[g]
g.to.dup.each { |gg|
ogg = order[gg] || newboxo[gg]
if ogg > og+1
# long edge, expand
sq = [g]
(ogg - 1 - og).times { |i| sq << newemptybox[] }
sq << gg
gg.from.delete g
g.to.delete gg
newboxo[g] ||= order[g]
sq.inject { |g1, g2|
g1.to |= [g2]
g2.from |= [g1]
newboxo[g2] = newboxo[g1]+1
g2
}
raise if newboxo[gg] != ogg
end
}
}
order.update newboxo
# layers[o] = [list of nodes of order o]
layers = []
groups.each { |g|
(layers[order[g]] ||= []) << g
}
layers
end
# take all groups, order them by order, layout as layers
# always return a single group holding everything
def layout_layers(groups)
order = order_graph(groups)
# already a tree
layers = create_layers(groups, order)
return if layers.empty?
layers.each { |l| l.each { |g| group_remove_hz_margin(g) } }
# widest layer width
maxlw = layers.map { |l| l.inject(0) { |ll, g| ll + g.w } }.max
# center the 1st layer boxes on a segment that large
x0 = -maxlw/2.0
curlw = layers[0].inject(0) { |ll, g| ll + g.w }
dx0 = (maxlw - curlw) / (2.0*layers[0].length)
layers[0].each { |g|
x0 += dx0
g.x = x0
x0 += g.w + dx0
}
# at this point, the goal is to reorder the most populated layer the best we can, and
# move other layers' boxes accordingly
layers[1..-1].each { |l|
# for each subsequent layer, reorder boxes based on their ties with the previous layer
i = 0
l.replace l.sort_by { |g|
# we know g.from is not empty (g would be in @layer[0])
medfrom = g.from.inject(0.0) { |mx, gg| mx + (gg.x + gg.w/2.0) } / g.from.length
# on ties, keep original order
[medfrom, i]
}
# now they are reordered, update their #x accordingly
# evenly distribute them in the layer
x0 = -maxlw/2.0
curlw = l.inject(0) { |ll, g| ll + g.w }
dx0 = (maxlw - curlw) / (2.0*l.length)
l.each { |g|
x0 += dx0
g.x = x0
x0 += g.w + dx0
}
}
layers[0...-1].reverse_each { |l|
# for each preceding layer (walking upwards), reorder boxes based on their ties with the next layer
i = 0
l.replace l.sort_by { |g|
if g.to.empty?
# TODO floating end
medfrom = 0
else
medfrom = g.to.inject(0.0) { |mx, gg| mx + (gg.x + gg.w/2.0) } / g.to.length
end
# on ties, keep original order
[medfrom, i]
}
# now they are reordered, update their #x accordingly
x0 = -maxlw/2.0
curlw = l.inject(0) { |ll, g| ll + g.w }
dx0 = (maxlw - curlw) / (2.0*l.length)
l.each { |g|
x0 += dx0
g.x = x0
x0 += g.w + dx0
}
}
# now the boxes are (hopefully) sorted correctly
# position them according to their ties with prev/next layer
# from the maxw layer (positioning = packed), propagate adjacent layers positions
maxidx = (0..layers.length).find { |i| l = layers[i] ; l.inject(0) { |ll, g| ll + g.w } == maxlw }
# list of layer indexes to walk
ilist = [maxidx]
ilist.concat((maxidx+1...layers.length).to_a) if maxidx < layers.length-1
ilist.concat((0..maxidx-1).to_a.reverse) if maxidx > 0
layerbox = []
ilist.each { |i|
l = layers[i]
curlw = l.inject(0) { |ll, g| ll + g.w }
# left/rightmost acceptable position for the current box w/o overflowing on the right side
minx = -maxlw/2.0
maxx = minx + (maxlw-curlw)
# replace whole layer with a box
newg = layerbox[i] = Box.new(nil, l.map { |g| g.content }.flatten)
newg.w = maxlw
newg.h = l.map { |g| g.h }.max
newg.x = -newg.w/2
newg.y = -newg.h/2
# dont care for from/to, we'll return a single box anyway
l.each { |g|
ref = (i < maxidx) ? g.to : g.from
# TODO elastic positioning around the ideal position
# (g and g+1 may have the same med, then center both on it)
if i == maxidx
nx = minx
elsif ref.empty?
nx = (minx+maxx)/2
else
# center on the outline of rx
# may want to center on rx center's center ?
rx = ref.sort_by { |gg| gg.x }
med = (rx.first.x + rx.last.x + rx.last.w - g.w) / 2.0
nx = [[med, minx].max, maxx].min
end
dx = nx+g.w/2
g.content.each { |b| b.x += dx }
minx = nx+g.w
maxx += g.w
}
}
newg = Box.new(nil, layerbox.map { |g| g.content }.flatten)
newg.w = layerbox.map { |g| g.w }.max
newg.h = layerbox.inject(0) { |h, g| h + g.h }
newg.x = -newg.w/2
newg.y = -newg.h/2
# vertical: just center each box on its layer
y0 = newg.y
layerbox.each { |lg|
lg.content.each { |b|
b.y += y0-lg.y
}
y0 += lg.h
}
groups.replace [newg]
end
# place boxes in a good-looking layout
# create artificial 'group' container for boxes, that will later be merged in geometrical patterns
def auto_arrange_init
# 'group' is an array of boxes
def auto_arrange_init(list=@box)
# groups is an array of box groups
# all groups are centered on the origin
h = {} # { box => group }
@groups = @box.map { |b|
@groups = list.map { |b|
b.x = -b.w/2
b.y = -b.h/2
g = Box.new(nil, [b])
@@ -631,7 +69,6 @@ class Graph
g.y = b.y - 9
g.w = b.w + 16
g.h = b.h + 18
h[b] = g
g
}
@@ -640,102 +77,395 @@ class Graph
# no self references
# a box is in one and only one group in 'groups'
@groups.each { |g|
g.to = g.content.first.to.map { |t| h[t] if t != g }.compact
g.from = g.content.first.from.map { |f| h[f] if f != g }.compact
g.to = g.content.first.to.map { |t| next if not t = list.index(t) ; @groups[t] }.compact - [g]
g.from = g.content.first.from.map { |f| next if not f = list.index(f) ; @groups[f] }.compact - [g]
}
# order boxes
order = order_graph(@groups)
# remove cycles from the graph
make_tree(@groups, order)
end
def auto_arrange_step(groups=@groups)
pattern_layout_col(groups) or pattern_layout_line(groups) or
pattern_layout_ifend(groups) or pattern_layout_complex(groups) or
layout_layers(groups)
end
def auto_arrange_post
auto_arrange_movebox
#auto_arrange_vertical_shrink
end
# actually move boxes inside the groups
def auto_arrange_movebox
@groups.each { |g|
dx = (g.x + g.w/2).to_i
dy = (g.y + g.h/2).to_i
g.content.each { |b|
b.x += dx
b.y += dy
}
}
end
def auto_arrange_vertical_shrink
# vertical shrink
# TODO stuff may shrink vertically more if we could move it slightly horizontally...
@box.sort_by { |b| b.y }.each { |b|
next if b.from.empty?
# move box up to its from, unless something blocks the way
min_y = b.from.map { |bb|
bb.y+bb.h
}.find_all { |by|
by <= b.y
}.max
moo = []
moo << 8*b.from.length
moo << 8*b.from[0].to.length
cx = b.x+b.w/2
moo << b.from.map { |bb| (cx - (bb.x+bb.w/2)).abs }.max / 10
cx = b.from[0].x+b.from[0].w/2
moo << b.from[0].to.map { |bb| (cx - (bb.x+bb.w/2)).abs }.max / 10
margin_y = 16 + moo.max
next if not min_y or b.y <= min_y + margin_y
blocking = @box.find_all { |bb|
next if bb == b
bb.y+bb.h > min_y and bb.y+bb.h < b.y and
bb.x-12 < b.x+b.w and bb.x+bb.w+12 > b.x
}
may_y = blocking.map { |bb| bb.y+bb.h } << min_y
do_y = may_y.sort.map { |by| by + margin_y }.find { |by|
# should not collide with b if moved to by+margin_y
not blocking.find { |bb|
bb.x-12 < b.x+b.w and bb.x+bb.w+12 > b.x and
bb.y-12 < by+b.h and bb.y+bb.h+12 > by
}
}
b.y = do_y if do_y < b.y
# no need to re-sort outer loop
}
# TODO
# energy-minimal positioning of boxes from this basic layout
# avoid arrow confusions
end
def auto_arrange_boxes
auto_arrange_init
nil while @groups.length > 1 and auto_arrange_step
auto_arrange_post
@groups = []
# walk from a box, fork at each multiple to, chop links to a previous box (loops etc)
@madetree = false
end
# gives a text representation of the current graph state
def dump_layout(groups=@groups)
groups.map { |g| "#{groups.index(g)} -> #{g.to.map { |t| groups.index(t) }.sort.inspect}" }
end
def auto_arrange_step
# TODO fix
# 0->[1, 2] 1->[3] 2->[3, 4] 3->[] 4->[1]
# push 0 jz l3 push 1 jz l4 push 2 l3: push 3 l4: hlt
# and more generally all non-looping graphs where this algo creates backward links
groups = @groups
return if groups.length <= 1
maketree = lambda { |roots|
next if @madetree
@madetree = true
maxdepth = {} # max arc count to reach this box from graph start (excl loop)
trim = lambda { |g, from|
# unlink g from (part of) its from
from.each { |gg| gg.to.delete g }
g.from -= from
}
walk = lambda { |g|
# score
parentdepth = g.from.map { |gg| maxdepth[gg] }
if parentdepth.empty?
# root
maxdepth[g] = 0
elsif parentdepth.include? nil
# not farthest parent found / loop
next
# elsif maxdepth[g] => ?
else
maxdepth[g] = parentdepth.max + 1
end
g.to.each { |gg| walk[gg] }
}
roots.each { |g| trim[g, g.from] unless g.from.empty? }
roots.each { |g| walk[g] }
# handle loops now (unmarked nodes)
while unmarked = groups - maxdepth.keys and not unmarked.empty?
if g = unmarked.find { |g_| g_.from.find { |gg| maxdepth[gg] } }
# loop head
trim[g, g.from.find_all { |gg| not maxdepth[gg] }] # XXX not quite sure for this
walk[g]
else
# disconnected subgraph
g = unmarked.find { |g_| g_.from.empty? } || unmarked.first
trim[g, g.from]
maxdepth[g] = 0
walk[g]
end
end
}
# concat all ary boxes into its 1st element, remove trailing groups from 'groups'
# updates from/to
merge_groups = lambda { |ary|
bg = Box.new(nil, [])
bg.x, bg.y = ary.map { |g| g.x }.min, ary.map { |g| g.y }.min
bg.w, bg.h = ary.map { |g| g.x+g.w }.max - bg.x, ary.map { |g| g.y+g.h }.max - bg.y
ary.each { |g|
bg.content.concat g.content
bg.to |= g.to
bg.from |= g.from
}
bg.to -= ary
bg.to.each { |t| t.from = t.from - ary + [bg] }
bg.from -= ary
bg.from.each { |f| f.to = f.to - ary + [bg] }
idx = ary.map { |g| groups.index(g) }.min
groups = @groups = groups - ary
groups.insert(idx, bg)
bg
}
# move all boxes within group of dx, dy
move_group = lambda { |g, dx, dy|
g.content.each { |b| b.x += dx ; b.y += dy }
g.x += dx ; g.y += dy
}
align_hz = lambda { |ary|
# if one of the blocks is much bigger than the others, put it on the far right
big = ary.sort_by { |g| g.h }.last
if (ary-[big]).all? { |g| g.h < big.h/3 }
ary -= [big]
else
big = nil
end
nx = ary.map { |g| g.w }.inject(0) { |a, b| a+b } / -2
nx *= 2 if big and ary.length == 1 # just put the parent on the separation of the 2 children
ary.each { |g|
move_group[g, nx-g.x, 0]
nx += g.w
}
move_group[big, nx-big.x, 0] if big
}
align_vt = lambda { |ary|
ny = ary.map { |g| g.h }.inject(0) { |a, b| a+b } / -2
ary.each { |g|
move_group[g, 0, ny-g.y]
ny += g.h
}
}
# scan groups for a column pattern (head has 1 'to' which from == [head])
group_columns = lambda {
groups.find { |g|
next if g.from.length == 1 and g.from.first.to.length == 1
ary = [g]
ary << (g = g.to.first) while g.to.length == 1 and g.to.first.from.length == 1
next if ary.length <= 1
align_vt[ary]
merge_groups[ary]
true
}
}
# scan groups for a line pattern (multiple groups with same to & same from)
group_lines = lambda { |strict|
if groups.all? { |g1| g1.from.empty? and g1.to.empty? }
# disjoint subgraphs
align_hz[groups]
merge_groups[groups]
next true
end
groups.find { |g1|
ary = g1.from.map { |gg| gg.to }.flatten.uniq.find_all { |gg|
gg != g1 and
(gg.from - g1.from).empty? and (g1.from - gg.from).empty? and
(strict ? ((gg.to - g1.to).empty? and (g1.to - gg.to).empty?) : (g1.to & gg.to).first)
}
ary = g1.to.map { |gg| gg.from }.flatten.uniq.find_all { |gg|
gg != g1 and
(gg.to - g1.to).empty? and (g1.to - gg.to).empty? and
(strict ? ((gg.from - g1.from).empty? and (g1.from - gg.from).empty?) : (g1.from & gg.from).first)
} if ary.empty?
next if ary.empty?
ary << g1
dy = 16*ary.map { |g| g.to.length + g.from.length }.inject { |a, b| a+b }
ary.each { |g| g.h += dy ; g.y -= dy/2 }
align_hz[ary]
if ary.first.to.empty? # shrink graph if highly dissymetric and to.empty?
ah = ary.map { |g| g.h }.max
ary.each { |g|
move_group[g, 0, (g.h-ah)/2] # move up
next if not p = ary[ary.index(g)-1]
y = [g.y, p.y].min # shrink width
h = [g.h, p.h].min
xp = p.content.map { |b| b.x+b.w if b.y+b.h+8 >= y and b.y-8 <= y+h }.compact.max || p.x+p.w/2
xg = g.content.map { |b| b.x if b.y+b.h+8 >= y and b.y-8 <= y+h }.compact.min || g.x+g.w/2
dx = xg-xp-24
next if dx <= 0
ary.each { |gg|
dx = -dx if gg == g
move_group[gg, dx/2, 0]
}
if p.x+p.w > ary.last.x+ary.last.w or ary.first.x > g.x # fix broken centerism
x = [g.x, ary.first.x].min
xm = [p.x+p.w, ary.last.x+ary.last.w].max
ary.each { |gg| move_group[gg, (x+xm)/-2, 0] }
end
}
end
merge_groups[ary]
true
}
}
group_inv_if = {}
# scan groups for a if/then pattern (1 -> 2 -> 3 & 1 -> 3)
group_ifthen = lambda { |strict|
groups.reverse.find { |g|
next if not g2 = g.to.find { |g2_| (g2_.to.length == 1 and g.to.include?(g2_.to.first)) or
(not strict and g2_.to.empty?) }
next if strict and g2.from != [g] or g.to.length != 2
g2.h += 16 ; g2.y -= 8
align_vt[[g, g2]]
dx = -g2.x+8
dx -= g2.w+16 if group_inv_if[g]
move_group[g2, dx, 0]
merge_groups[[g, g2]]
true
}
}
# if (a || b) c;
# the 'else' case handles '&& else', and && is two if/then nested
group_or = lambda { |strict|
groups.find { |g|
next if g.to.length != 2
g2 = g.to[0]
g2 = g.to[1] if not g2.to.include? g.to[1]
thn = (g.to & g2.to).first
next if g2.to.length != 2 or not thn or thn.to.length != 1
els = (g2.to - [thn]).first
if thn.to == [els]
els = nil
elsif els.to != thn.to
next if strict
align_vt[[g, g2]]
merge_groups[[g, g2]]
break true
else
align_hz[[thn, els]]
thn = merge_groups[[thn, els]]
end
thn.h += 16 ; thn.y -= 8
align_vt[[g, g2, thn]]
move_group[g2, -g2.x, 0]
move_group[thn, thn.x-8, 0] if not els
merge_groups[[g, g2, thn]]
true
}
}
# loop with exit 1 -> 2, 3 & 2 -> 1
group_loop = lambda {
groups.find { |g|
next if not g2 = g.to.sort_by { |g2_| g2_.h }.find { |g2_| g2_.to == [g] or (g2_.to.empty? and g2_.from == [g]) }
g2.h += 16
align_vt[[g, g2]]
move_group[g2, g2.x-8, 0]
merge_groups[[g, g2]]
true
}
}
# same single from or to
group_halflines = lambda {
ary = nil
if groups.find { |g| ary = g.from.find_all { |gg| gg.to == [g] } and ary.length > 1 } or
groups.find { |g| ary = g.to.find_all { |gg| gg.from == [g] } and ary.length > 1 }
align_hz[ary]
merge_groups[ary]
true
end
}
# unknown pattern, group as we can..
group_other = lambda {
puts 'graph arrange: unknown configuration', dump_layout
g1 = groups.find_all { |g| g.from.empty? }
g1 << groups[rand(groups.length)] if g1.empty?
g2 = g1.map { |g| g.to }.flatten.uniq - g1
align_vt[g1]
g1 = merge_groups[g1]
g1.w += 128 ; g1.x -= 64
next if g2.empty?
align_vt[g2]
g2 = merge_groups[g2]
g2.w += 128 ; g2.x -= 64
align_hz[[g1, g2]]
merge_groups[[g1, g2]]
true
}
# check constructs with multiple blocks that have a 'to' to the end block (a la break;)
ign_break = lambda {
can_reach = lambda { |b1, b2, term|
next if b1 == term
done = [term]
todo = b1.to.dup
while t = todo.pop
next if done.include? t
done << t
break true if t == b2
todo.concat t.to
end
}
can_reach_unidir = lambda { |b1, b2, term| can_reach[b1, b2, term] and not can_reach[b2, b1, term] }
groups.find { |g|
f2 = nil
if (g.from.length > 2 and f3 = g.from.find { |f| f.to == [g] } and f1 = g.from.find { |f|
f2 = g.from.find { |ff| can_reach_unidir[ff, f3, g] and can_reach_unidir[f, ff, g] }}) or
(g.to.length > 2 and f3 = g.to.find { |f| f.from == [g] } and f1 = g.to.find { |f|
f2 = g.to.find { |ff| can_reach_unidir[f3, ff, g] and can_reach_unidir[ff, f, g] }})
group_inv_if[f1] = true
if f3.to == [g]
g.from.delete f2
f2.to.delete g
else
g.to.delete f2
f2.from.delete g
end
true
end
}
}
# walk graph from roots, cut backward links
trim_graph = lambda {
next true if ign_break[]
g1 = groups.find_all { |g| g.from.empty? }
g1 << groups.first if g1.empty?
cntpre = groups.inject(0) { |cntpre_, g| cntpre_ + g.to.length }
g1.each { |g| maketree[[g]] }
cntpost = groups.inject(0) { |cntpre_, g| cntpre_ + g.to.length }
true if cntpre != cntpost
}
# known, clean patterns
group_clean = lambda {
group_columns[] or group_lines[true] or group_ifthen[true] or group_loop[] or group_or[true]
}
# approximations
group_unclean = lambda {
group_lines[false] or group_or[false] or group_halflines[] or group_ifthen[false] or group_other[]
}
group_clean[] or trim_graph[] or group_unclean[]
end
# the boxes are almost in place; here we smooth the result a little & fix some quirks
def auto_arrange_post
# entrypoint should be above other boxes, same for exitpoints
@box.each { |b|
if b.from == []
chld = b.to
chld = @box - [b] if not @box.find { |bb| bb != b and bb.from == [] }
chld.each { |t| b.y = t.y - b.h - 16 if t.y < b.y+b.h }
end
if b.to == []
chld = b.from
chld = @box - [b] if not @box.find { |bb| bb != b and bb.to == [] }
chld.each { |f| b.y = f.y + f.h + 16 if f.y+f.h > b.y }
end
}
boxxy = @box.sort_by { |bb| bb.y }
# fill gaps that we created
@box.each { |b|
bottom = b.y+b.h
next if not follower = boxxy.find { |bb| bb.y+bb.h > bottom }
# preserve line[] constructs margins
gap = follower.y-16*follower.from.length - (bottom+16*b.to.length)
next if gap <= 0
@box.each { |bb|
if bb.y+bb.h <= bottom
bb.y += gap/2
else
bb.y -= gap/2
end
}
boxxy = @box.sort_by { |bb| bb.y }
}
@box[0,0].each { |b|
# TODO elastic positioning (ignore up arrows ?) & collision detection (box/box + box/arrow)
f = b.from[0]
t = b.to[0]
if b.to.length == 1 and b.from.length == 1 and b.y+b.h<t.y and b.y>f.y+f.h
wx = (t.x+t.w/2 + f.x+f.w/2)/2 - b.w/2
wy = (t.y + f.y+f.h)/2 - b.h/2
b.x += (wx-b.x)/5
b.y += (wy-b.y)/5
end
}
end
def auto_arrange_boxes
auto_arrange_init
nil while @groups.length > 1 and auto_arrange_step
@groups = []
auto_arrange_post
end
end
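Note on the hunk above: the `can_reach` lambda inside `ign_break` is a depth-first search that treats one node as a barrier. A minimal standalone sketch of that idea follows. The `Node` struct and the `can_reach?` method name are hypothetical stand-ins for illustration and are not part of Metasm.

```ruby
# Hypothetical minimal node type; the real code walks Graph box groups.
Node = Struct.new(:name, :to)

# Depth-first search: is b2 reachable from b1 without passing through term?
# Identity comparison (equal?) is used so cyclic graphs are handled safely.
def can_reach?(b1, b2, term)
  return false if b1.equal?(term)
  done = [term]            # nodes we refuse to (re)visit; term acts as a barrier
  todo = b1.to.dup
  while (t = todo.pop)
    next if done.any? { |d| d.equal?(t) }
    done << t
    return true if t.equal?(b2)
    todo.concat(t.to)
  end
  false
end

# a -> b -> c
a = Node.new('a', [])
b = Node.new('b', [])
c = Node.new('c', [])
a.to << b
b.to << c
```

With this chain, `can_reach?(a, c, nil)` succeeds, while passing `b` as the barrier cuts the only path. The "unidirectional" variant in the diff simply requires reachability one way and not the other.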
@@ -758,27 +488,23 @@ class GraphViewWidget < DrawableWidget
@shown_boxes = []
@mousemove_origin = @mousemove_origin_ctrl = nil
@curcontext = Graph.new(nil)
@want_focus_addr = nil
@margin = 8
@zoom = 1.0
@default_color_association = ColorTheme.merge :hlbox_bg => :palegrey, :box_bg => :white,
:arrow_hl => :red, :arrow_cond => :darkgreen, :arrow_uncond => :darkblue,
:arrow_direct => :darkred, :box_bg_shadow => :black, :background => :paleblue
@default_color_association = { :background => :paleblue, :hlbox_bg => :palegrey, :box_bg => :white,
:text => :black, :arrow_hl => :red, :comment => :darkblue, :address => :darkblue,
:instruction => :black, :label => :darkgreen, :caret => :black, :hl_word => :palered,
:cursorline_bg => :paleyellow, :arrow_cond => :darkgreen, :arrow_uncond => :darkblue,
:arrow_direct => :darkred }
# @othergraphs = ? (to keep user-specified formatting)
end
def view_x; @curcontext.view_x; end
def view_x=(vx); @curcontext.view_x = vx; end
def view_y; @curcontext.view_y; end
def view_y=(vy); @curcontext.view_y = vy; end
def resized(w, h)
redraw
end
def find_box_xy(x, y)
x = view_x+x/@zoom
y = view_y+y/@zoom
x = @curcontext.view_x+x/@zoom
y = @curcontext.view_y+y/@zoom
@shown_boxes.to_a.reverse.find { |b| b.x <= x and b.x+b.w > x and b.y <= y-1 and b.y+b.h > y+1 }
end
@@ -786,7 +512,6 @@ class GraphViewWidget < DrawableWidget
case dir
when :up
if @zoom < 100
# zoom in
oldzoom = @zoom
@zoom *= 1.1
@zoom = 1.0 if (@zoom-1.0).abs < 0.05
@@ -794,8 +519,7 @@ class GraphViewWidget < DrawableWidget
@curcontext.view_y += (y / oldzoom - y / @zoom)
end
when :down
if @zoom > 1.0/1000
# zoom out
if @zoom > 1.0/100
oldzoom = @zoom
@zoom /= 1.1
@zoom = 1.0 if (@zoom-1.0).abs < 0.05
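Note on the zoom hunks above: the `view_x`/`view_y` adjustment (`y / oldzoom - y / @zoom`) keeps the graph point under the mouse cursor fixed while the zoom changes. A minimal sketch of that invariant follows. The `View` struct and the `zoom_at` name are assumptions made for illustration, and the snap-to-1.0 step from the diff is left out.

```ruby
# Hypothetical stand-in for the widget's view state.
View = Struct.new(:view_x, :view_y, :zoom)

# Scale the zoom by factor while keeping the graph point under the
# screen position (x, y) at the same screen position.
def zoom_at(view, x, y, factor)
  oldzoom = view.zoom
  view.zoom *= factor
  # graph coordinate under the cursor is view_x + x/zoom; keep it constant
  view.view_x += x / oldzoom - x / view.zoom
  view.view_y += y / oldzoom - y / view.zoom
  view
end

v = View.new(10.0, 20.0, 1.0)
zoom_at(v, 100.0, 50.0, 1.1)   # zoom in by one wheel step around (100, 50)
```

The graph coordinate `view_x + x/zoom` is the same before and after the call, which is why the picture appears to grow around the cursor instead of around the window origin.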
@@ -833,10 +557,10 @@ class GraphViewWidget < DrawableWidget
@mousemove_origin = nil
if @mousemove_origin_ctrl
x1 = view_x + @mousemove_origin_ctrl[0]/@zoom
x1 = @curcontext.view_x + @mousemove_origin_ctrl[0]/@zoom
x2 = x1 + (x - @mousemove_origin_ctrl[0])/@zoom
x1, x2 = x2, x1 if x1 > x2
y1 = view_y + @mousemove_origin_ctrl[1]/@zoom
y1 = @curcontext.view_y + @mousemove_origin_ctrl[1]/@zoom
y2 = y1 + (y - @mousemove_origin_ctrl[1])/@zoom
y1, y2 = y2, y1 if y1 > y2
@selected_boxes |= @curcontext.box.find_all { |b| b.x >= x1 and b.x + b.w <= x2 and b.y >= y1 and b.y + b.h <= y2 }
@@ -863,8 +587,8 @@ class GraphViewWidget < DrawableWidget
if b = find_box_xy(x, y)
@selected_boxes = [b] if not @selected_boxes.include? b
@caret_box = b
@caret_x = (view_x+x/@zoom-b.x-1).to_i / @font_width
@caret_y = (view_y+y/@zoom-b.y-1).to_i / @font_height
@caret_x = (@curcontext.view_x+x/@zoom-b.x-1).to_i / @font_width
@caret_y = (@curcontext.view_y+y/@zoom-b.y-1).to_i / @font_height
update_caret
else
@selected_boxes = []
@@ -873,43 +597,22 @@ class GraphViewWidget < DrawableWidget
redraw
end
def setup_contextmenu(b, m)
cm = new_menu
addsubmenu(cm, 'copy _word') { clipboard_copy(@hl_word) if @hl_word }
addsubmenu(cm, 'copy _line') { clipboard_copy(@caret_box[:line_text_col][@caret_y].map { |ss, cc| ss }.join) }
addsubmenu(cm, 'copy _box') {
sb = @selected_boxes
sb = [@curbox] if sb.empty?
clipboard_copy(sb.map { |ob| ob[:line_text_col].map { |s| s.map { |ss, cc| ss }.join + "\r\n" }.join }.join("\r\n"))
} # XXX auto \r\n vs \n
addsubmenu(m, '_clipboard', cm)
addsubmenu(m, 'clone _window') { @parent_widget.clone_window(@hl_word, :graph) }
addsubmenu(m, 'show descendants only') { hide_non_descendants(@selected_boxes) }
addsubmenu(m, 'show ascendants only') { hide_non_ascendants(@selected_boxes) }
addsubmenu(m, 'restore graph') { gui_update }
end
# if the target is a call to a subfunction, open a new window with the graph of this function (popup)
def rightclick(x, y)
if b = find_box_xy(x, y) and @zoom >= 0.90 and @zoom <= 1.1
click(x, y)
@mousemove_origin = nil
m = new_menu
setup_contextmenu(b, m)
if @parent_widget.respond_to?(:extend_contextmenu)
@parent_widget.extend_contextmenu(self, m, @caret_box[:line_address][@caret_y])
end
popupmenu(m, x, y)
@parent_widget.clone_window(@hl_word, :graph)
end
end
def doubleclick(x, y)
@mousemove_origin = nil
if b = find_box_xy(x, y)
@mousemove_origin = nil
if @hl_word and @zoom >= 0.90 and @zoom <= 1.1
@parent_widget.focus_addr(@hl_word)
else
@parent_widget.focus_addr((b[:addresses] || b[:line_address]).first)
@parent_widget.focus_addr b[:addresses].first
end
elsif doubleclick_check_arrow(x, y)
elsif @zoom == 1.0
@@ -925,21 +628,20 @@ class GraphViewWidget < DrawableWidget
# check if the user clicked on the beginning/end of an arrow, if so focus on the other end
def doubleclick_check_arrow(x, y)
return if @margin*@zoom < 2
x = view_x+x/@zoom
y = view_y+y/@zoom
x = @curcontext.view_x+x/@zoom
y = @curcontext.view_y+y/@zoom
sx = nil
if bt = @shown_boxes.to_a.reverse.find { |b|
y >= b.y+b.h-1 and y <= b.y+b.h-1+@margin+2 and
sx = b.x+b.w/2 - b.to.length/2 * @margin/2 and
x >= sx-@margin/2 and x <= sx+b.to.length*@margin/2 # should be margin/4, but add a little comfort margin
x >= sx-@margin/2 and x <= sx+b.to.length*@margin/2 # should be margin/4, but add a little comfort margin
}
idx = (x-sx+@margin/4).to_i / (@margin/2)
idx = 0 if idx < 0
idx = bt.to.length-1 if idx >= bt.to.length
if bt.to[idx]
if @parent_widget
@caret_box, @caret_y = bt, bt[:line_address].length-1
@parent_widget.focus_addr bt.to[idx][:line_address][0]
@parent_widget.focus_addr bt.to[idx][:line_address][0]
else
focus_xy(bt.to[idx].x, bt.to[idx].y)
end
@@ -948,15 +650,14 @@ class GraphViewWidget < DrawableWidget
elsif bf = @shown_boxes.to_a.reverse.find { |b|
y >= b.y-@margin-2 and y <= b.y and
sx = b.x+b.w/2 - b.from.length/2 * @margin/2 and
x >= sx-@margin/2 and x <= sx+b.from.length*@margin/2
x >= sx-@margin/2 and x <= sx+b.from.length*@margin/2
}
idx = (x-sx+@margin/4).to_i / (@margin/2)
idx = 0 if idx < 0
idx = bf.from.length-1 if idx >= bf.from.length
if bf.from[idx]
if @parent_widget
@caret_box, @caret_y = bf, bf[:line_address].length-1
@parent_widget.focus_addr bf.from[idx][:line_address][-1]
@parent_widget.focus_addr bf.from[idx][:line_address][-1]
else
focus_xy(bf.from[idx].x, bf.from[idx].y)
end
@@ -967,12 +668,10 @@ class GraphViewWidget < DrawableWidget
# update the zoom & view_xy to show the whole graph in the window
def zoom_all
minx, miny, maxx, maxy = @curcontext.boundingbox
minx -= @margin
miny -= @margin
maxx += @margin
maxy += @margin
minx = @curcontext.box.map { |b| b.x }.min.to_i - 10
miny = @curcontext.box.map { |b| b.y }.min.to_i - 10
maxx = @curcontext.box.map { |b| b.x + b.w }.max.to_i + 10
maxy = @curcontext.box.map { |b| b.y + b.h }.max.to_i + 10
@zoom = [width.to_f/(maxx-minx), height.to_f/(maxy-miny)].min
@zoom = 1.0 if @zoom > 1.0 or (@zoom-1.0).abs < 0.1
@curcontext.view_x = minx + (maxx-minx-width/@zoom)/2
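Note on `zoom_all`: both versions in the hunk compute a padded bounding box, pick the largest zoom that fits it in the window, and center the view. A self-contained sketch follows. The `Box` struct, the `fit_zoom` name, and the `view_y` line are assumptions (the hunk is cut off after `view_x`); the 10-unit padding is taken from the added lines.

```ruby
# Hypothetical box type with position and size in graph coordinates.
Box = Struct.new(:x, :y, :w, :h)

# Return [zoom, view_x, view_y] so that all boxes fit in a width x height window.
def fit_zoom(boxes, width, height, pad = 10)
  minx = boxes.map { |b| b.x }.min - pad
  miny = boxes.map { |b| b.y }.min - pad
  maxx = boxes.map { |b| b.x + b.w }.max + pad
  maxy = boxes.map { |b| b.y + b.h }.max + pad
  zoom = [width.to_f / (maxx - minx), height.to_f / (maxy - miny)].min
  # never magnify past 1:1, and snap to 1:1 when close to it
  zoom = 1.0 if zoom > 1.0 || (zoom - 1.0).abs < 0.1
  view_x = minx + (maxx - minx - width / zoom) / 2
  view_y = miny + (maxy - miny - height / zoom) / 2   # assumed symmetric to view_x
  [zoom, view_x, view_y]
end

boxes = [Box.new(0, 0, 100, 50), Box.new(200, 300, 100, 50)]
zoom, view_x, view_y = fit_zoom(boxes, 160, 185)
```

Taking the minimum of the two ratios guarantees the graph fits along both axes; the leftover space on the other axis is split evenly by the centering term.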
@@ -1000,10 +699,9 @@ class GraphViewWidget < DrawableWidget
}
@shown_boxes = []
w_w = width
w_h = height
w_w, w_h = width, height
@curcontext.box.each { |b|
next if b.x >= view_x+w_w/@zoom or b.y >= view_y+w_h/@zoom or b.x+b.w <= view_x or b.y+b.h <= view_y
next if b.x >= @curcontext.view_x+w_w/@zoom or b.y >= @curcontext.view_y+w_h/@zoom or b.x+b.w <= @curcontext.view_x or b.y+b.h <= @curcontext.view_y
@shown_boxes << b
paint_box(b)
}
@@ -1022,10 +720,9 @@ class GraphViewWidget < DrawableWidget
end
def paint_arrow(b1, b2)
x1 = x1o = b1.x+b1.w/2-view_x
y1 = b1.y+b1.h-view_y
x2 = x2o = b2.x+b2.w/2-view_x
y2 = b2.y-1-view_y
x1, y1 = b1.x+b1.w/2-@curcontext.view_x, b1.y+b1.h-@curcontext.view_y
x2, y2 = b2.x+b2.w/2-@curcontext.view_x, b2.y-1-@curcontext.view_y
x1o, x2o = x1, x2
margin = @margin
x1 += (-(b1.to.length-1)/2 + b1.to.index(b2)) * margin/2
x2 += (-(b2.from.length-1)/2 + b2.from.index(b1)) * margin/2
@@ -1033,6 +730,12 @@ class GraphViewWidget < DrawableWidget
margin, x1, y1, x2, y2, b1w, b2w, x1o, x2o = [margin, x1, y1, x2, y2, b1.w, b2.w, x1o, x2o].map { |v| v*@zoom }
# XXX gtk wraps coords around 0x8000
if x1.abs > 0x7000 ; y1 /= x1.abs/0x7000 ; x1 /= x1.abs/0x7000 ; end
if y1.abs > 0x7000 ; x1 /= y1.abs/0x7000 ; y1 /= y1.abs/0x7000 ; end
if x2.abs > 0x7000 ; y2 /= x2.abs/0x7000 ; x2 /= x2.abs/0x7000 ; end
if y2.abs > 0x7000 ; x2 /= y2.abs/0x7000 ; y2 /= y2.abs/0x7000 ; end
# straighten vertical arrows if possible
if y2 > y1 and (x1-x2).abs <= margin
if b1.to.length == 1
@@ -1052,13 +755,26 @@ class GraphViewWidget < DrawableWidget
y1 += margin
y2 -= margin-1
end
if y2 > y1 - b1.h*@zoom - 2*margin+1
# straight arrow
if y2+margin >= y1-margin-1
# straight vertical down arrow
draw_line(x1, y1, x2, y2) if x1 != y1 or x2 != y2
else
# arrow goes up: navigate around b2
# else arrow up, need to sneak around boxes
elsif x1o-b1w/2-margin >= x2o+b2w/2+margin # z
draw_line(x1, y1, x1o-b1w/2-margin, y1)
draw_line(x1o-b1w/2-margin, y1, x2o+b2w/2+margin, y2)
draw_line(x2o+b2w/2+margin, y2, x2, y2)
draw_line(x1, y1+1, x1o-b1w/2-margin, y1+1) # double
draw_line(x1o-b1w/2-margin+1, y1, x2o+b2w/2+margin+1, y2)
draw_line(x2o+b2w/2+margin, y2+1, x2, y2+1)
elsif x1+b1w/2+margin <= x2-b2w/2-margin # invert z
draw_line(x1, y1, x1o+b1w/2+margin, y1)
draw_line(x1o+b1w/2+margin, y1, x2o-b2w/2-margin, y2)
draw_line(x2o-b2w/2-margin, y2, x2, y2)
draw_line(x1, y1+1, x1+b1w/2+margin, y1+1) # double
draw_line(x1o+b1w/2+margin+1, y1, x2o-b2w/2-margin+1, y2)
draw_line(x2o-b2w/2-margin, y2+1, x2, y2+1)
else # turn around
x = (x1 <= x2 ? [x1o-b1w/2-margin, x2o-b2w/2-margin].min : [x1o+b1w/2+margin, x2o+b2w/2+margin].max)
draw_line(x1, y1, x, y1)
draw_line(x, y1, x, y2)
@@ -1070,7 +786,7 @@ class GraphViewWidget < DrawableWidget
end
def set_color_boxshadow(b)
draw_color :box_bg_shadow
draw_color :black
end
def set_color_box(b)
@@ -1083,29 +799,28 @@ class GraphViewWidget < DrawableWidget
def paint_box(b)
set_color_boxshadow(b)
draw_rectangle((b.x-view_x+3)*@zoom, (b.y-view_y+4)*@zoom, b.w*@zoom, b.h*@zoom)
draw_rectangle((b.x-@curcontext.view_x+3)*@zoom, (b.y-@curcontext.view_y+4)*@zoom, b.w*@zoom, b.h*@zoom)
set_color_box(b)
draw_rectangle((b.x-view_x)*@zoom, (b.y-view_y+1)*@zoom, b.w*@zoom, b.h*@zoom)
draw_rectangle((b.x-@curcontext.view_x)*@zoom, (b.y-@curcontext.view_y+1)*@zoom, b.w*@zoom, b.h*@zoom)
# current text position
x = (b.x - view_x + 1)*@zoom
y = (b.y - view_y + 1)*@zoom
w_w = (b.x - view_x + b.w - @font_width)*@zoom
w_h = (b.y - view_y + b.h - @font_height)*@zoom
w_h = height if w_h > height
x = (b.x - @curcontext.view_x + 1)*@zoom
y = (b.y - @curcontext.view_y + 1)*@zoom
w_w = (b.x - @curcontext.view_x + b.w - @font_width)*@zoom
w_h = (b.y - @curcontext.view_y + b.h - @font_height)*@zoom
if @parent_widget and @parent_widget.bg_color_callback
ly = 0
b[:line_address].each { |a|
if c = @parent_widget.bg_color_callback[a]
draw_rectangle_color(c, (b.x-view_x)*@zoom, (1+b.y-view_y+ly*@font_height)*@zoom, b.w*@zoom, (@font_height*@zoom).ceil)
draw_rectangle_color(c, (b.x-@curcontext.view_x)*@zoom, (1+b.y-@curcontext.view_y+ly*@font_height)*@zoom, b.w*@zoom, (@font_height*@zoom).ceil)
end
ly += 1
}
end
if @caret_box == b
draw_rectangle_color(:cursorline_bg, (b.x-view_x)*@zoom, (1+b.y-view_y+@caret_y*@font_height)*@zoom, b.w*@zoom, @font_height*@zoom)
draw_rectangle_color(:cursorline_bg, (b.x-@curcontext.view_x)*@zoom, (1+b.y-@curcontext.view_y+@caret_y*@font_height)*@zoom, b.w*@zoom, @font_height*@zoom)
end
return if @zoom < 0.99 or @zoom > 1.1
@@ -1114,22 +829,33 @@ class GraphViewWidget < DrawableWidget
# renders a string at current cursor position with a color
# must not include newline
render = lambda { |str, color|
# function ends when we write under the bottom of the listing
next if y >= w_h+2 or x >= w_w
draw_string_hl(color, x, y, str)
if @hl_word
stmp = str
pre_x = 0
while stmp =~ /^(.*?)(\b#{Regexp.escape @hl_word}\b)/
s1, s2 = $1, $2
pre_x += s1.length * @font_width
hl_x = s2.length * @font_width
draw_rectangle_color(:hl_word, x+pre_x, y, hl_x, @font_height*@zoom)
pre_x += hl_x
stmp = stmp[s1.length+s2.length..-1]
end
end
draw_string_color(color, x, y, str)
x += str.length * @font_width
}
yoff = @font_height * @zoom
b[:line_text_col].each { |list|
list.each { |s, c| render[s, c] } if y >= -yoff
x = (b.x - view_x + 1)*@zoom
y += yoff
break if y > w_h+2
list.each { |s, c| render[s, c] }
x = (b.x - @curcontext.view_x + 1)*@zoom
y += @font_height*@zoom
}
if b == @caret_box and focus?
cx = (b.x - view_x + 1 + @caret_x*@font_width)*@zoom
cy = (b.y - view_y + 1 + @caret_y*@font_height)*@zoom
cx = (b.x - @curcontext.view_x + 1 + @caret_x*@font_width)*@zoom
cy = (b.y - @curcontext.view_y + 1 + @caret_y*@font_height)*@zoom
draw_line_color(:caret, cx, cy, cx, cy+(@font_height-1)*@zoom)
end
end
@@ -1168,33 +894,33 @@ class GraphViewWidget < DrawableWidget
end
def load_dotfile(path)
load_dot(File.read(path))
end
def load_dot(dota)
@want_update_graph = false
@curcontext.clear
boxes = {}
new_box = lambda { |text|
b = @curcontext.new_box(text, :line_text_col => [[[text, :text]]])
b.w = (text.length+1) * @font_width
b.w = text.length * @font_width
b.h = @font_height
b
}
dota.scan(/^.*$/) { |l|
a = l.strip.chomp(';').split(/->/).map { |s| s.strip.delete '"' }
next if not id = a.shift
b0 = boxes[id] ||= new_box[id]
while id = a.shift
b1 = boxes[id] ||= new_box[id]
b0.to |= [b1]
b1.from |= [b0]
b0 = b1
max = File.size(path)
i = 0
File.open(path) { |fd|
while l = fd.gets
case l.strip
when /^"?(\w+)"?\s*->\s*"?(\w+)"?;?$/
b1 = boxes[$1] ||= new_box[$1]
b2 = boxes[$2] ||= new_box[$2]
b1.to |= [b2]
b2.from |= [b1]
end
$stderr.printf("%.02f\r" % (fd.pos*100.0/max)) if (i += 1) & 0xff == 0
end
}
p boxes.length
redraw
rescue Interrupt
puts "dot_len #{boxes.length}"
rescue Interrupt
p boxes.length
end
# create the graph objects in ctx
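Note on the dot loader hunk above: one of the two versions shown matches each line against a single `a -> b` edge pattern and records the link in both directions. A minimal sketch of that parsing step, decoupled from the GUI, follows. The `parse_dot_edges` name and the hash layout are assumptions for illustration; only the regular expression is taken from the diff.

```ruby
# Parse simple 'a -> b;' edge lines from dot text.
# Returns { node_name => { to: [names], from: [names] } }.
def parse_dot_edges(text)
  nodes = {}
  text.each_line do |l|
    # same edge pattern as in the diff: optional quotes, optional trailing ';'
    next unless l.strip =~ /^"?(\w+)"?\s*->\s*"?(\w+)"?;?$/
    src, dst = $1, $2
    n1 = nodes[src] ||= { to: [], from: [] }
    n2 = nodes[dst] ||= { to: [], from: [] }
    n1[:to] |= [dst]     # |= keeps the edge list free of duplicates
    n2[:from] |= [src]
  end
  nodes
end

dot = "digraph g {\n a -> b;\n \"b\" -> c\n a -> b;\n}\n"
graph = parse_dot_edges(dot)
```

This pattern only accepts one edge per line; the other version in the hunk splits on `->` instead, which also handles chains such as `a -> b -> c`.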
@@ -1283,7 +1009,7 @@ class GraphViewWidget < DrawableWidget
}
end
render["#{Expression[curaddr]} ", :address] if @show_addresses
render[di.instruction.to_s.ljust(di.comment ? 18 : 0), :instruction]
render[di.instruction.to_s.ljust(di.comment ? 24 : 0), :instruction]
render[' ; ' + di.comment.join(' ')[0, 64], :comment] if di.comment
nl[]
else
@@ -1316,8 +1042,6 @@ class GraphViewWidget < DrawableWidget
}
@parent_widget.list_bghilight("search result for /#{pat}/i", list) { |i| @parent_widget.focus_addr i[0] }
}
when ?+; mouse_wheel_ctrl(:up, width/2, height/2)
when ?-; mouse_wheel_ctrl(:down, width/2, height/2)
else return false
end
true
@@ -1411,18 +1135,16 @@ class GraphViewWidget < DrawableWidget
end
when :pgup
if @caret_box
@caret_y -= (height/4/@zoom/@font_height).to_i
@caret_y = 0 if @caret_y < 0
update_caret(false)
@caret_y = 0
update_caret
else
@curcontext.view_y -= height/4/@zoom
redraw
end
when :pgdown
if @caret_box
@caret_y += (height/4/@zoom/@font_height).to_i
@caret_y = [@caret_box[:line_address].length-1, @caret_y].min
update_caret(false)
@caret_y = @caret_box[:line_address].length-1
update_caret
else
@curcontext.view_y += height/4/@zoom
redraw
@@ -1430,7 +1152,7 @@ class GraphViewWidget < DrawableWidget
when :home
if @caret_box
@caret_x = 0
update_caret(false)
update_caret
else
@curcontext.view_x = @curcontext.box.map { |b_| b_.x }.min-10
@curcontext.view_y = @curcontext.box.map { |b_| b_.y }.min-10
@@ -1439,7 +1161,7 @@ class GraphViewWidget < DrawableWidget
when :end
if @caret_box
@caret_x = @caret_box[:line_text_col][@caret_y].to_a.map { |ss, cc| ss }.join.length
update_caret(false)
update_caret
else
@curcontext.view_x = [@curcontext.box.map { |b_| b_.x+b_.w }.max-width/@zoom+10, @curcontext.box.map { |b_| b_.x }.min-10].max
@curcontext.view_y = [@curcontext.box.map { |b_| b_.y+b_.h }.max-height/@zoom+10, @curcontext.box.map { |b_| b_.y }.min-10].max
@@ -1453,24 +1175,38 @@ class GraphViewWidget < DrawableWidget
b_.to.each { |bb| bb.from.delete b_ }
}
redraw
when :popupmenu
if @caret_box
cx = (@caret_box.x - view_x + 1 + @caret_x*@font_width)*@zoom
cy = (@caret_box.y - view_y + 1 + @caret_y*@font_height)*@zoom
rightclick(cx, cy)
end
when ?a
t0 = Time.now
puts 'autoarrange'
@curcontext.auto_arrange_boxes
redraw
puts 'autoarrange done %.02f' % (Time.now - t0)
puts 'autoarrange done'
when ?u
gui_update
when ?R
load __FILE__
when ?S # reset
@curcontext.auto_arrange_init(@selected_boxes.empty? ? @curcontext.box : @selected_boxes)
puts 'reset', @curcontext.dump_layout, ''
zoom_all
redraw
when ?T # step auto_arrange
@curcontext.auto_arrange_step
puts @curcontext.dump_layout, ''
zoom_all
redraw
when ?L # post auto_arrange
@curcontext.auto_arrange_post
zoom_all
redraw
when ?V # shrink
@selected_boxes.each { |b_|
dx = (b_.from + b_.to).map { |bb| bb.x+bb.w/2 - b_.x-b_.w/2 }
dx = dx.inject(0) { |s, xx| s+xx }/dx.length
b_.x += dx
}
redraw
when ?I # create arbitrary boxes/links
if @selected_boxes.empty?
@fakebox ||= 0
@@ -1481,7 +1217,7 @@ class GraphViewWidget < DrawableWidget
b.h = @font_height * 2
b.x = rand(200) - 100
b.y = rand(200) - 100
@fakebox += 1
else
b1, *bl = @selected_boxes
@@ -1493,8 +1229,8 @@ class GraphViewWidget < DrawableWidget
else
b1.to << b2
b2.from << b1
end
}
end
}
end
redraw
@@ -1520,48 +1256,6 @@ class GraphViewWidget < DrawableWidget
true
end
def hide_non_descendants(list)
reach = {}
todo = list.dup
while b = todo.pop
next if reach[b]
reach[b] = true
b.to.each { |bb|
todo << bb if bb.y+bb.h >= b.y
}
end
@curcontext.box.delete_if { |bb|
!reach[bb]
}
@curcontext.box.each { |bb|
bb.from.delete_if { |bbb| !reach[bbb] }
bb.to.delete_if { |bbb| !reach[bbb] }
}
redraw
end
def hide_non_ascendants(list)
reach = {}
todo = list.dup
while b = todo.pop
next if reach[b]
reach[b] = true
b.from.each { |bb|
todo << bb if bb.y <= b.h+b.y
}
end
@curcontext.box.delete_if { |bb|
!reach[bb]
}
@curcontext.box.each { |bb|
bb.from.delete_if { |bbb| !reach[bbb] }
bb.to.delete_if { |bbb| !reach[bbb] }
}
redraw
end
# find a suitable array of graph roots, walking up from a block (function start/entrypoint)
def dasm_find_roots(addr)
todo = [addr]
@@ -1618,6 +1312,7 @@ class GraphViewWidget < DrawableWidget
@curcontext.view_y += (height/2 / @zoom - height/2)
@zoom = 1.0
focus_xy(b.x, b.y + @caret_y*@font_height)
update_caret
elsif can_update_context
@curcontext = Graph.new 'testic'
@@ -1631,59 +1326,23 @@ class GraphViewWidget < DrawableWidget
end
def focus_xy(x, y)
# dont move during a click
return if @mousemove_origin
# ensure the caret stays onscreen
if not view_x
@curcontext.view_x = x - width/5/@zoom
redraw
elsif @caret_box and @caret_box.w < width*27/30/@zoom
# keep @caret_box full if possible
if view_x + width/20/@zoom > @caret_box.x
@curcontext.view_x = @caret_box.x-width/20/@zoom
elsif view_x + width*9/10/@zoom < @caret_box.x+@caret_box.w
@curcontext.view_x = @caret_box.x+@caret_box.w-width*9/10/@zoom
end
elsif view_x + width/20/@zoom > x
@curcontext.view_x = x-width/20/@zoom
redraw
elsif view_x + width*9/10/@zoom < x
@curcontext.view_x = x-width*9/10/@zoom
if not @curcontext.view_x or @curcontext.view_x*@zoom + width*3/4 < x or @curcontext.view_x*@zoom > x
@curcontext.view_x = (x - width/5)/@zoom
redraw
end
if not view_y
@curcontext.view_y = y - height/5/@zoom
redraw
elsif @caret_box and @caret_box.h < height*27/30/@zoom
if view_y + height/20/@zoom > @caret_box.y
@curcontext.view_y = @caret_box.y-height/20/@zoom
elsif view_y + height*9/10/@zoom < @caret_box.y+@caret_box.h
@curcontext.view_y = @caret_box.y+@caret_box.h-height*9/10/@zoom
end
elsif view_y + height/20/@zoom > y
@curcontext.view_y = y-height/20/@zoom
redraw
elsif view_y + height*9/10/@zoom < y
@curcontext.view_y = y-height*9/10/@zoom
if not @curcontext.view_y or @curcontext.view_y*@zoom + height*3/4 < y or @curcontext.view_y*@zoom > y
@curcontext.view_y = (y - height/5)/@zoom
redraw
end
end
# hint that the caret moved
# redraw, change the hilighted word
def update_caret(update_hlword = true)
return if not b = @caret_box or not @caret_x or not l = @caret_box[:line_text_col][@caret_y]
if update_hlword
l = l.map { |s, c| s }.join
@parent_widget.focus_changed_callback[] if @parent_widget and @parent_widget.focus_changed_callback and @oldcaret_y != @caret_y
update_hl_word(l, @caret_x)
end
focus_xy(b.x + @caret_x*@font_width, b.y + @caret_y*@font_height)
def update_caret
return if not @caret_box or not @caret_x or not l = @caret_box[:line_text_col][@caret_y]
l = l.map { |s, c| s }.join
@parent_widget.focus_changed_callback[] if @parent_widget and @parent_widget.focus_changed_callback and @oldcaret_y != @caret_y
update_hl_word(l, @caret_x)
redraw
end

Some files were not shown because too many files have changed in this diff.