[SGVLUG] Four Tips To Avoid Open Source Legal Problems
Emerson, Tom
Tom.Emerson at wbconsultant.com
Fri Jul 7 15:39:54 PDT 2006
> -----Original Message----- Of Michael B. Parker
>
> [...] leads me to think that it might be possible for a
> product which MIGHT meaningfully detect a lot of source code
> reuse if some sort of "running strings" were in the original
> source. While GUI code uses running strings often, much
> algorithmic code doesn't.
I'm not sure that what you think "running strings" means is the same
thing Dustin was commenting on -- "running", in this case, refers to
executing the program (called "strings") with the suspect binary as the
input for the program.
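For anyone who hasn't used it, here is a rough Python sketch of what the
"strings" program does: it scans a binary for runs of printable
characters. This is only an approximation for illustration, not the real
utility's code. The function name extract_strings, the 4-character
minimum, and the sample bytes in blob are all assumptions made up for
the example.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    # 0x20-0x7e is the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Hypothetical stand-in for a suspect binary: mostly junk bytes with one
# literal embedded in it.
blob = b"\x7fELF\x00\x01\x02THIS IS THE SMOKING GUN\x00\x03ab\x00"

print(extract_strings(blob))  # ['THIS IS THE SMOKING GUN']
```

A literal such as a DISPLAY message survives compilation as plain text,
which is why it shows up in this kind of scan; short runs like "ELF" and
"ab" fall under the minimum length and are dropped.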
> ... However, perhaps something could be
> made to insert some identifying hard-to-remove logic into
> SOURCE CODE that would be easy to detect (even in a compiled
> binary) and hard to remove even by a programmer with the
> source, unless he/she understood the source well. I don't
> know, maybe it's wishful thinking to create this watermark
> deliberately in source.
Years ago, I got my hands on a "QA" manual for a company that was
rumored to have strong ties to Scientology (more along the lines that
the owners and/or programming staff were proponents of Scientology, not
necessarily that the company publicly supported the cult). In any case,
there was a check-off item entitled "smoking gun installed", and the
descriptive text went on to explain that this is code that will never
actually execute AND has a distinctive "signature" when compiled -- this
signature would be copied down and (as I recall...) sent off to some
"escrow" type company (or perhaps the patent office?) to be used in the
event of suspected piracy. [I don't know for certain whether this
"smoking gun installed" policy was simply a good, sound business
practice or just another paranoid Scientology tactic...]
In any case, the "smoking gun" code would look perfectly reasonable, say
for instance an error reporting routine called when a check finds that a
particular variable holds an invalid value, but careful evaluation of
the code would show that this variable would NEVER contain that value
(except, possibly, in the case of an errant pointer overwriting memory,
but then again, this was in COBOL, which doesn't exactly have "pointer"
variables...). An "obvious" version would look like this:
===============================
MAIN.
    PERFORM ROUTINE-A VARYING WS-GLOBAL-1 FROM 1 BY 1
        UNTIL WS-GLOBAL-1 > 10
    STOP RUN.
ROUTINE-A.
    EVALUATE WS-GLOBAL-1
        WHEN 1 PERFORM ROUTINE-B
        WHEN 2 PERFORM ROUTINE-C
        WHEN 3 PERFORM ROUTINE-D
        ...
        WHEN 10 PERFORM ROUTINE-K.
ROUTINE-B.
*   does various things, buried among which is this
    ADD 1 TO WS-GLOBAL-1.
*   statement...
ROUTINE-C.
    DISPLAY "THIS IS THE SMOKING GUN".
================================
[EGAD, is he really trying to teach us COBOL????!!!!]
{yup, but it's not so bad -- it's practically pseudo-code as it is...}
Looking at each segment individually, everything appears OK -- you have
a "for loop" (from 1 to 10), you have a case/select statement with a
case for each possible value, you have 10 routines to be called
depending on the value of the loop's control variable, and so on.
What isn't obvious is that on each run through the loop there is a value
of "ws-global-1" that "routine-a" will never see: "routine-b" bumps the
counter from 1 to 2, the loop's own increment then takes it to 3, and so
the "when 2" case (and with it "routine-c") is never reached. The
compiler won't optimize this away (unless it is a REALLY good optimizing
compiler!) and most "people" won't see it (especially if the code is
"spaghettified", or better still if a "renames" type variable is used to
refer to the same physical variable under a different name...)
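If you'd rather not trace the COBOL by hand, the same control flow can
be walked through in a few lines of Python. This is a transliteration
sketch, not the original code. It assumes the default test-before
behaviour of PERFORM ... VARYING, and it assumes that none of the elided
routines touch the counter. The function name trace_evaluate_values is
made up for the example.

```python
def trace_evaluate_values() -> tuple[list[int], bool]:
    """Mimic MAIN/ROUTINE-A and record each value the EVALUATE actually sees."""
    seen = []
    smoking_gun_ran = False
    ws_global_1 = 1                  # VARYING WS-GLOBAL-1 FROM 1
    while ws_global_1 <= 10:         # UNTIL WS-GLOBAL-1 > 10, tested before each pass
        seen.append(ws_global_1)     # ROUTINE-A: EVALUATE WS-GLOBAL-1
        if ws_global_1 == 1:
            ws_global_1 += 1         # ROUTINE-B: the buried ADD 1 TO WS-GLOBAL-1
        elif ws_global_1 == 2:
            smoking_gun_ran = True   # ROUTINE-C: the smoking gun
        # The remaining WHEN branches are assumed to leave the counter alone.
        ws_global_1 += 1             # BY 1
    return seen, smoking_gun_ran


seen, smoking_gun_ran = trace_evaluate_values()
print(seen)             # [1, 3, 4, 5, 6, 7, 8, 9, 10]
print(smoking_gun_ran)  # False
```

The value 2 never reaches the EVALUATE, so the branch that performs
ROUTINE-C is dead code even though every piece looks legitimate on its
own.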
Since paragraph "routine-c" is the target of a perform, the compiler
will produce call/return logic around it, and of course, whatever is
within will likely be "non-optimizable" as well, which helps ensure that
the compiler produces "the same object code" even if someone does a
global change to the identifiers (variable names).