ROSE  0.11.96.11
taintAnalysis.h
1 #include <featureTests.h>
2 #ifdef ROSE_ENABLE_SOURCE_ANALYSIS
3 
4 #ifndef ROSE_TaintAnalysis_H
5 #define ROSE_TaintAnalysis_H
6 
8 // Tainted flow analysis.
9 //
10 // The original version of this tainted flow analysis was written 2012-09 by someone other than the author of the
11 // genericDataflow framework. It is based on the sign analysis (sgnAnalysis.[Ch]) in this same directory since documentation
12 // for the genericDataflow framework is fairly sparse (5 pages in the tutorial, not counting the code listings, and no doxygen
13 // documentation).
14 //
15 // This file contains two types of comments:
16 // 1. Comments that try to document some of the things I've discovered through playing with the genericDataflow framework.
17 // 2. Comments and suggestions about usability, consistency, applicability to binary analysis, etc.
18 //
19 // [RPM 2012-09]
21 
22 
23 // USABILITY: Names of header files aren't consistent across the genericDataflow files. E.g., in the "lattice" directory we
24 // have "lattice.h" that defines the Lattice class, but "ConstrGraph.h" that defines "ConstrGraph" (and apparently no
25 // documentation as to what "Constr" means).
26 #include "lattice.h"
27 #include "dataflow.h"
28 #include "liveDeadVarAnalysis.h" // misspelled? Shouldn't it be liveDeadVarsAnalysis or LiveDeadVarsAnalysis?
29 
30 
31 // USABILITY: The abundant use of dynamic_cast makes it seem like something's wrong with the whole dataflow design. And I
32 // couldn't find any documentation about when it's okay to cast from Lattice to one of its subclasses, so I've made
33 // the assumption throughout that when the dynamic_cast returns null, the node in question points to a variable or
34 // expression that the live/dead analysis has determined to be dead.
35 //
36 // USABILITY: No doxygen comments throughout genericDataflow framework?!? But it looks like there's some doxygen-like stuff
37 // describing a few function parameters, so is it using some other documenting system? I at least added the headers
38 // to the docs/Rose/rose.cfg file so doxygen picks up the structure.
39 //
40 // USABILITY: The genericDataflow framework always produces files named "index.html", "summary.html", and "detail.html" and
41 // a directory named "dbg_imgs" regardless of any debug settings. These are apparently the result of the Dbg::init()
42 // call made from the main program, and this call is required (omitting it results in segmentation faults). The
43 // file names are constant and therefore one should expect tests to clobber each other's outputs when run in
44 // parallel. The contents of the files cannot be trusted. Furthermore, since these are HTML files, an aborted test
45 // will generate only a partial HTML file which some browsers will choke on.
46 
47 
48 /******************************************************************************************************************************
49  * Taint Lattice
50  ******************************************************************************************************************************/
51 
57 class TaintLattice: public FiniteLattice {
58 public:
59 
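63  enum Vertex {
64  VERTEX_BOTTOM, // No information is known about the value of the variable.
65  VERTEX_UNTAINTED, // Value is not tainted.
66  VERTEX_TAINTED // Value is tainted.
67  // no need for a top since that would imply that the value is tainted. I.e., VERTEX_TAINTED *is* our top.
68  };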
69 
70 protected:
71  Vertex vertex;
73 public:
74 
75  // Default initializer makes this object point to the lattice's bottom vertex.
76  TaintLattice(): vertex(VERTEX_BOTTOM) {}
77 
79  virtual void initialize() override {
80  *this = TaintLattice();
81  }
82 
86  Vertex get_vertex() const { return vertex; }
87  bool set_vertex(Vertex v);
92  virtual Lattice *copy() const override {
93  return new TaintLattice(*this);
94  }
95 
96  // USABILITY: The base class defines copy() without a const argument, so we must do the same here.
99  virtual void copy(/*const*/ Lattice *other_) override;
100 
101 
102  // USABILITY: The base class defines '==' with non-const argument and "this", so we must do the same here.
103  // USABILITY: This is not a real equality predicate since it's not symmetric. In other words, (A==B) does not imply (B==A)
104  // for all values of A and B.
106  virtual bool operator==(/*const*/ Lattice *other_) /*const*/ override;
107 
108  // USABILITY: The base class defines str() with non-const "this", so we must do the same here. That means that if we want
109  // to use this functionality from our own methods (that have const "this") we have to distill it out to some
110  // other place.
111  // USABILITY: The "prefix" argument is pointless. Why not just use StringUtility::prefixLines() in the base class rather
112  // than replicate this functionality all over the place?
116  virtual std::string str(/*const*/ std::string /*&*/prefix) /*const*/ override {
117  return prefix + to_string();
118  }
119 
120  // USABILITY: We define this only because of deficiencies with the "str" signature in the base class. Otherwise our
121  // printing method (operator<<) could just use str(). We're trying to avoid evil const_cast.
124  std::string to_string() const;
125 
126  // USABILITY: The base class defines meetUpdate() with a non-const argument, so we must do the same here.
128  virtual bool meetUpdate(/*const*/ Lattice *other_) override;
129 
130  friend std::ostream& operator<<(std::ostream &o, const TaintLattice &lattice);
131 };
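// SKETCH: A minimal, standalone illustration of the three-vertex lattice described above. It does not use ROSE or the
//         genericDataflow framework. The names Taint and SketchLattice are hypothetical, and two behaviors are
//         assumptions rather than facts taken from this header: that the vertices form a chain
//         bottom < untainted < tainted, and that set_vertex()/meetUpdate() return true when the object changed.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for TaintLattice::Vertex; the chain ordering is an assumption.
enum class Taint { Bottom = 0, Untainted = 1, Tainted = 2 };

// Hypothetical stand-in for TaintLattice; not the ROSE class.
struct SketchLattice {
    Taint vertex = Taint::Bottom;   // default state points at the bottom vertex

    // Point this object at vertex v and report whether anything changed.
    bool set_vertex(Taint v) {
        bool changed = (vertex != v);
        vertex = v;
        return changed;
    }

    // Merge with another value by keeping the higher of the two vertices.
    // Returns true if this object changed, which a fixed-point loop could use to detect convergence.
    bool meetUpdate(const SketchLattice &other) {
        Taint merged = (static_cast<int>(other.vertex) > static_cast<int>(vertex)) ? other.vertex : vertex;
        return set_vertex(merged);
    }

    // Const string conversion, usable from other const methods.
    std::string to_string() const {
        switch (vertex) {
            case Taint::Bottom:    return "bottom";
            case Taint::Untainted: return "untainted";
            case Taint::Tainted:   return "tainted";
        }
        return "unknown";
    }
};
```

// Under these assumptions, merging a tainted value into any other value yields tainted, and merging the same value a
// second time reports no change.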
132 
133 /******************************************************************************************************************************
134  * Taint Flow Analysis
135  ******************************************************************************************************************************/
136 
137 class TaintAnalysis: public IntraFWDataflow {
138 protected:
139  LiveDeadVarsAnalysis* ldv_analysis;
140  std::ostream *debug;
141 
142 public:
143  // USABILITY: Documentation as to why a live/dead analysis is used in SgnAnalysis would be nice. I tried doing it without
144  // originally to make things simpler, but it seems that the FiniteVarsExprsProductLattice depends on it even
145  // though I saw commented out code and comments somewhere(?) that indicated otherwise.
146  TaintAnalysis(LiveDeadVarsAnalysis *ldv_analysis)
147  : ldv_analysis(ldv_analysis), debug(NULL) {}
148 
152  std::ostream *get_debug() const { return debug; }
153  void set_debug(std::ostream *os) { debug = os; }
156  // BINARIES: The "Function" type is a wrapper around SgFunctionDeclaration and the data flow traversals depend on this
157  // fact. Binaries don't have SgFunctionDeclaration nodes (they have SgAsmFunction, which is a bit different).
158  //
159  // NOTE: The "DataflowNode" is just a VirtualCFG::DataflowNode that contains a VirtualCFG::CFGNode pointer and a
160  // "filter". I didn't find any documentation for how "filter" is used.
161  //
162  // USABILITY: The "initLattices" and "initFacts" are not documented. They're apparently only outputs for this function
163  // since they seem to be empty on every call and are not const. They're apparently not parallel arrays since
164  // the examples I was looking at don't push the same number of items into each vector.
165  //
166  // USABILITY: Copied from src/midend/programAnalysis/genericDataflow/simpleAnalyses/sgnAnalysis.C. I'm not sure what
167  // it's doing yet since there's no doxygen documentation for FiniteVarsExprsProductLattice or any of its
168  // members.
169  //
170  // BINARIES: This might not work for binaries because FiniteVarsExprsProductLattice seems to do things in terms of
171  // variables. Variables are typically lacking from binary specimens and most existing binary analysis
172  // describes things in terms of static register names or dynamic memory locations.
175  void genInitState(const Function& func, const DataflowNode& node, const NodeState& state,
176  std::vector<Lattice*>& initLattices, std::vector<NodeFact*>& initFacts);
177 
178  // USABILITY: Not documented in doxygen, so I'm more or less copying from the SgnAnalysis::transfer() method defined in
179  // src/midend/programAnalysis/genericDataflow/sgnAnalysis.C.
187  bool transfer(const Function& func, const DataflowNode& node_, NodeState& state, const std::vector<Lattice*>& dfInfo);
188 
189 protected:
192  static std::string lattice_info(const TaintLattice *lattice) {
193  return lattice ? lattice->to_string() : "dead";
194  }
195 
212  bool magic_tainted(SgNode *node, FiniteVarsExprsProductLattice *prodLat);
213 };
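// SKETCH: transfer() is only declared in this header, so the following is not its implementation. It is a small
//         standalone illustration of the general idea behind a forward taint transfer, assuming the common rule that
//         an assigned variable is tainted when any variable it is computed from is tainted. Assign and transferAll
//         are hypothetical names, and a plain bool per variable stands in for a lattice value.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical straight-line statement: dest is computed from the listed source variables.
struct Assign {
    std::string dest;
    std::vector<std::string> sources;
};

// One forward pass over the statements.
// dest becomes tainted if and only if at least one source is currently tainted.
// Variables missing from the map are treated as untainted.
// Returns true if any variable's recorded state changed during the pass.
bool transferAll(const std::vector<Assign> &stmts, std::map<std::string, bool> &tainted) {
    bool changed = false;
    for (const Assign &stmt : stmts) {
        bool result = false;
        for (const std::string &src : stmt.sources) {
            auto it = tainted.find(src);
            if (it != tainted.end() && it->second)
                result = true;
        }
        auto found = tainted.find(stmt.dest);
        bool known = (found != tainted.end());
        if (!known || found->second != result)
            changed = true;
        tainted[stmt.dest] = result;
    }
    return changed;
}
```

// The boolean result mirrors the bool returned by the transfer() declaration above, on the assumption that it signals
// whether the dataflow state was modified.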
214 
215 #endif
216 #endif
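// SKETCH: The header assumes that a null result from dynamic_cast means the variable or expression is dead, and
//         lattice_info() prints "dead" for a null pointer. The standalone example below illustrates that convention
//         with hypothetical types (BaseLattice, TaintValue, OtherValue, describe); none of them are ROSE classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical polymorphic base, standing in for Lattice.
struct BaseLattice {
    virtual ~BaseLattice() = default;
};

// Hypothetical taint value, standing in for TaintLattice.
struct TaintValue : BaseLattice {
    bool tainted = false;
};

// Some unrelated lattice type that the cast should reject.
struct OtherValue : BaseLattice {};

// Downcast and describe the value. A failed cast (or a null input) is reported as "dead",
// following the assumption stated in the header's comments.
std::string describe(BaseLattice *lattice) {
    const TaintValue *value = dynamic_cast<const TaintValue*>(lattice);
    if (!value)
        return "dead";
    return value->tainted ? "tainted" : "untainted";
}
```

// Centralizing the cast in one helper keeps the null check in a single place instead of repeating it at every use.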
TaintLattice
    A pointer to a vertex of the static taint lattice.
    Definition: taintAnalysis.h:57
TaintLattice::VERTEX_BOTTOM
    No information is known about the value of the variable.
    Definition: taintAnalysis.h:68
TaintLattice::VERTEX_UNTAINTED
    Value is not tainted.
    Definition: taintAnalysis.h:69
TaintLattice::VERTEX_TAINTED
    Value is tainted.
    Definition: taintAnalysis.h:70
TaintLattice::vertex
    Vertex vertex
    The vertex of the static taint lattice to which this object points.
    Definition: taintAnalysis.h:73
TaintLattice::TaintLattice
    TaintLattice()
    Default initializer makes this object point to the lattice's bottom vertex.
    Definition: taintAnalysis.h:78
TaintLattice::initialize
    virtual void initialize() override
    Same as default constructor.
    Definition: taintAnalysis.h:81
TaintLattice::get_vertex
    Vertex get_vertex() const
    Accessor for this node's vertex in the lattice.
    Definition: taintAnalysis.h:88
TaintLattice::set_vertex
    bool set_vertex(Vertex v)
    Accessor for this node's vertex in the lattice.
TaintLattice::copy
    virtual Lattice * copy() const override
    Returns a new copy of this vertex pointer.
    Definition: taintAnalysis.h:94
TaintLattice::operator==
    virtual bool operator==(Lattice *other_) override
    Equality predicate, sort of.
TaintLattice::str
    virtual std::string str(std::string prefix) override
    String representation of the lattice vertex to which this object points.
    Definition: taintAnalysis.h:118
TaintLattice::to_string
    std::string to_string() const
    String representation of a lattice vertex.
TaintLattice::meetUpdate
    virtual bool meetUpdate(Lattice *other_) override
    Merges this lattice node with another and stores the result in this node.

TaintAnalysis
    Definition: taintAnalysis.h:137
TaintAnalysis::get_debug
    std::ostream * get_debug() const
    Accessor for debug settings.
    Definition: taintAnalysis.h:154
TaintAnalysis::set_debug
    void set_debug(std::ostream *os)
    Accessor for debug settings.
    Definition: taintAnalysis.h:155
TaintAnalysis::genInitState
    void genInitState(const Function &func, const DataflowNode &node, const NodeState &state, std::vector< Lattice * > &initLattices, std::vector< NodeFact * > &initFacts)
    Generate initial lattice state.
TaintAnalysis::transfer
    bool transfer(const Function &func, const DataflowNode &node_, NodeState &state, const std::vector< Lattice * > &dfInfo)
    Adjust a result vertex pointer.
TaintAnalysis::lattice_info
    static std::string lattice_info(const TaintLattice *lattice)
    Helps print lattice pointers.
    Definition: taintAnalysis.h:194
TaintAnalysis::magic_tainted
    bool magic_tainted(SgNode *node, FiniteVarsExprsProductLattice *prodLat)
    Make certain variables always tainted.

IntraFWDataflow
    Definition: dataflow.h:245
Lattice
    Definition: lattice.h:12
FiniteLattice
    Definition: lattice.h:125
NodeState
    Definition: nodeState.h:92
VirtualCFG::DataflowNode
    Definition: DataflowCFG.h:19
FiniteVarsExprsProductLattice
    Definition: liveDeadVarAnalysis.h:334
LiveDeadVarsAnalysis
    Definition: liveDeadVarAnalysis.h:162
Function
    Definition: CallGraphTraverse.h:20
SgNode
    This class represents the base class for all IR nodes within Sage III.
    Definition: Cxx_Grammar.h:6739
Vertex
    Definition: SgGraphTemplate.h:7