
Xelerance has been working on making transport mode work when two
clients, each behind a different NAT, have the same IP address.

It has other benefits as well: we can probably make a transport-mode OE
work through NATs, even when the clients have the same IP addresses. 

That means that the clients do not have to be assigned unique IPs, but
we can't do gateway mode OE.  Maybe that's okay, if we can do OE for
NAT-ed clients. 

Also, BTNS is going towards transport-mode only SAs. 

How does this all work?

1) we augment ipsecX with "mastX"

  Of the three mastX modes that we discussed years ago, only one has
  been implemented. That is the mode where if the nfmark is set to 
     0x80000000 | ((SAref & 0x7fff) << 16)

  then we extract the SAref and use it to lookup an SA in the SA table
  that RGB wrote years ago. 
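
  A minimal sketch of that encoding, assuming the top bit flags "this
  mark carries an SAref" and the next 15 bits hold the SAref itself.
  The macro and function names are made up for illustration and are
  not taken from the KLIPS/MAST source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names only; not the identifiers used in the kernel code. */
#define SKETCH_MARK_FLAG   0x80000000u  /* top bit: mark carries an SAref */
#define SKETCH_SAREF_MASK  0x7fffu      /* 15 bits of SAref */
#define SKETCH_SAREF_SHIFT 16

/* Build an nfmark value that carries the given SAref. */
static uint32_t sketch_saref_to_nfmark(uint32_t saref)
{
    return SKETCH_MARK_FLAG | ((saref & SKETCH_SAREF_MASK) << SKETCH_SAREF_SHIFT);
}

/* Extract the SAref; returns false if the mark does not carry one. */
static bool sketch_nfmark_to_saref(uint32_t nfmark, uint32_t *saref)
{
    if ((nfmark & SKETCH_MARK_FLAG) == 0)
        return false;
    *saref = (nfmark >> SKETCH_SAREF_SHIFT) & SKETCH_SAREF_MASK;
    return true;
}
```

  With this layout, SAref 0x12 maps to the mark 0x80120000.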

  Some changes to it: the SAref is now a proper extension. I did this
  so that our SA create message was identical to stock PFKEYv2.
  This also makes it easier to #include the KAME/NETKEY
  linux/pfkeyv2.h. 

  It turns out that NETKEY has skb->sp, which is:

  struct sec_path
  {
	atomic_t		refcnt;
	int			len;
	struct sec_decap_state	x[XFRM_MAX_DEPTH];
  };

  to this, I have added:

  typedef unsigned int xfrm_sec_unique_t;
	xfrm_sec_unique_t       ref;       /*reference to high-level policy*/

  I don't know what the sec_decap_state{} structure does yet, but it
  involves the xfrm code's policy check. We have a much easier "ref" which
  is just an integer. 

  After checking the nfmark, we then check the skb->sp->ref value, and
  if non-null use that instead for the SA lookup.
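
  The lookup order just described might be sketched like this. The
  structs are cut-down stand-ins for sk_buff and sec_path, and
  SKETCH_SAREF_NULL stands in for IPSEC_SAREF_NULL (its value of 0 is
  an assumption here).

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SAREF_NULL 0u  /* stand-in for IPSEC_SAREF_NULL (assumed 0) */

/* Cut-down stand-ins for the kernel structures, for illustration only. */
struct sketch_sec_path {
    unsigned int ref;             /* reference to high-level policy */
};

struct sketch_skb {
    uint32_t nfmark;
    struct sketch_sec_path *sp;
};

/* Pick the SAref for the SA lookup: nfmark first, then sp->ref overrides. */
static unsigned int sketch_pick_saref(const struct sketch_skb *skb)
{
    unsigned int ref = SKETCH_SAREF_NULL;

    if (skb->nfmark & 0x80000000u)
        ref = (skb->nfmark >> 16) & 0x7fffu;

    if (skb->sp != NULL && skb->sp->ref != SKETCH_SAREF_NULL)
        ref = skb->sp->ref;

    return ref;
}
```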

  The other two modes for mastX are (for the record):
      b) punt all traffic into a well-defined SA# ("PPP" mode).
	 This would be useful for building a virtual-leased line,
	 and would eliminate the GRE layer that is currently required to
	 make BGP-over-IPsec work.

      c) a mode where it has a link-layer value, and for each tunnel
	 that is up, it populates the neighbour cache with something
	 that is meaningful to it.

2) in ipsec_rcv(), we set up skb->sp to contain a sec_path that 
   has a ref set up correctly.

3) for UDP sockets, we set up a new IP-level option that permits the
   skb->sp to be mapped to a new "ancillary" data message. This message
   is currently pretty primitive, consisting of two "ref"s.

   These are called generally "ref" (or refme) and "refhim".
   The first is refme, and is the SA on which the packet arrived.
   The second is refhim, and is the ref of an SA on which a reply can be
   sent.

   For TCP sockets, the TCP layer will be taking care of this stuff.
   (for SCTP, it will have to be a mix!)

   For UDP, the application needs to worry about it.
   Although I guess connected UDP sockets would not need to care about it, but
   that's really not that common a use.
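
   On the application side, picking the two refs out of the ancillary
   data returned by recvmsg() could look roughly like this. The option
   number SKETCH_IP_IPSEC_REFINFO is a placeholder (the real value
   depends on the patched kernel), and the struct and function names
   are made up for the example.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Placeholder option number; the real one depends on the patched kernel. */
#define SKETCH_IP_IPSEC_REFINFO 30

struct sketch_refinfo {
    unsigned int refme;   /* SA on which the packet arrived */
    unsigned int refhim;  /* SA on which a reply can be sent */
};

/* Scan the control messages of a received msghdr.
 * Returns 1 and fills *out if a refinfo message is present, else 0. */
static int sketch_parse_refinfo(struct msghdr *msg, struct sketch_refinfo *out)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IP &&
            cmsg->cmsg_type == SKETCH_IP_IPSEC_REFINFO &&
            cmsg->cmsg_len >= CMSG_LEN(2 * sizeof(unsigned int))) {
            unsigned int refs[2];

            /* memcpy avoids alignment assumptions about CMSG_DATA */
            memcpy(refs, CMSG_DATA(cmsg), sizeof(refs));
            out->refme = refs[0];
            out->refhim = refs[1];
            return 1;
        }
    }
    return 0;
}
```

   The socket would first need the new IP-level option enabled with
   setsockopt() so that the kernel attaches this message at all.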

4) In the case of xl2tpd, we use the "refhim" as an additional index into
   the call/tunnel list, letting us distinguish two hosts that appear to
   have the same outer IP.
   Actually, we could probably use *ONLY* refhim.
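
   As a rough illustration of using refhim as an extra index: the
   structures below are invented for this sketch and are not xl2tpd's
   real tunnel/call structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup key; not the layout xl2tpd actually uses. */
struct sketch_peer_key {
    uint32_t addr;        /* outer (post-NAT) IPv4 address */
    uint16_t port;        /* outer UDP port */
    unsigned int refhim;  /* ref of the SA used to reply to this peer */
};

struct sketch_tunnel {
    struct sketch_peer_key key;
    int tunnel_id;
};

/* Linear search for the tunnel matching address, port AND refhim. */
static struct sketch_tunnel *sketch_find_tunnel(struct sketch_tunnel *tunnels,
                                                size_t count,
                                                const struct sketch_peer_key *key)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (tunnels[i].key.addr == key->addr &&
            tunnels[i].key.port == key->port &&
            tunnels[i].key.refhim == key->refhim)
            return &tunnels[i];
    }
    return NULL;
}
```

   Two peers with identical outer address and port still land on
   different tunnels as long as their refhim values differ.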

   We don't use "ref", although we do record it the first time, since it
   currently changes during rekeys. I am going to be fixing that now
   that I've figured out exactly how.

   The problem is that we have to keep the other (older) SAs around when
   we are doing rekeys, since there may still be packets in flight. In
   fact,  the client could even, say, load balance among the currently
   valid SAs... might occur for some HA system...

   I'm going to be linking the "older" SAs to the newest SA on a singly
   linked list.

   Oh... didn't mention that SAs are now all properly reference counted.
   (I HOPE!)
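
   One way the "older SAs hang off the newest SA" idea could look,
   purely as a sketch: the struct, its fields and the reference
   counting rule (one reference held by the chain) are assumptions,
   not the actual KLIPS structures.

```c
#include <stddef.h>

/* Hypothetical SA record; not the real struct ipsec_sa. */
struct sketch_sa {
    unsigned int spi;
    int refcount;             /* one reference is held by the chain */
    struct sketch_sa *older;  /* next older SA that is still valid */
};

/* Make new_sa the newest SA; the previous newest becomes its "older". */
static struct sketch_sa *sketch_sa_push_newest(struct sketch_sa *newest,
                                               struct sketch_sa *new_sa)
{
    new_sa->older = newest;
    new_sa->refcount++;       /* reference held by the chain */
    return new_sa;
}

/* Find an SA by SPI anywhere on the chain (packets still in flight). */
static struct sketch_sa *sketch_sa_find(struct sketch_sa *newest, unsigned int spi)
{
    struct sketch_sa *sa;

    for (sa = newest; sa != NULL; sa = sa->older)
        if (sa->spi == spi)
            return sa;
    return NULL;
}

/* Unlink an SA once the peer deletes it; returns the (possibly new) head. */
static struct sketch_sa *sketch_sa_unlink(struct sketch_sa *newest, unsigned int spi)
{
    struct sketch_sa **pp = &newest;

    while (*pp != NULL) {
        if ((*pp)->spi == spi) {
            struct sketch_sa *dead = *pp;

            *pp = dead->older;
            dead->older = NULL;
            dead->refcount--;  /* drop the chain's reference */
            break;
        }
        pp = &(*pp)->older;
    }
    return newest;
}
```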

5) in pluto, we have to be a bit crafty to get things set up right.

   KLIPS can assign SArefs if the passed SAref is IPSEC_SAREF_NULL,
   and this is fine for new SAs. But, we need to know the outgoing SA
   when we create the incoming SA, so that it can reference it.

   Normally, pluto creates the incoming SA during QUICK_R1 (when it
   receives I1, and sends R1). It then creates the outgoing SA when
   it gets I2.

   Now, if pluto doesn't know the outgoing SA's "refhim", it creates
   the outgoing SA immediately. (It doesn't "eroute" it yet)

   It then creates the incoming SA.

   Then at I2, if it hasn't created the outgoing SA, it creates it.
   Why might that happen? Well, at a rekey, we already know the refhim
   that we want to use, so we can just create it normally.
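
   The ordering above can be sketched as follows. This is not pluto
   code: the state struct, the fake SAref allocator and the function
   names are invented, and SKETCH_SAREF_NULL stands in for
   IPSEC_SAREF_NULL. The point is that the I2 step tests "did this
   exchange install its outgoing SA?" rather than "is refhim NULL?",
   because during a rekey refhim is deliberately not NULL.

```c
#include <stdbool.h>

#define SKETCH_SAREF_NULL 0u  /* stand-in for IPSEC_SAREF_NULL */

/* Hypothetical per-exchange state; not pluto's struct state. */
struct sketch_quick_state {
    unsigned int refhim;           /* ref of outgoing SA, kept across rekeys */
    unsigned int incoming_refhim;  /* refhim recorded in the incoming SA */
    bool outgoing_done;            /* outgoing SA installed for THIS exchange? */
    int outgoing_installs;         /* number of outgoing installs, for checking */
};

static unsigned int sketch_next_ref = 100;  /* fake kernel SAref allocator */

static void sketch_install_outgoing(struct sketch_quick_state *st)
{
    if (st->refhim == SKETCH_SAREF_NULL)
        st->refhim = sketch_next_ref++;     /* kernel assigns a fresh SAref */
    st->outgoing_done = true;
    st->outgoing_installs++;
}

/* Responder has received I1 and is about to send R1. */
static void sketch_quick_r1(struct sketch_quick_state *st)
{
    /* Unknown refhim: the outgoing SA must exist before the incoming one. */
    if (st->refhim == SKETCH_SAREF_NULL)
        sketch_install_outgoing(st);

    st->incoming_refhim = st->refhim;       /* install the incoming SA */
}

/* Responder has received I2. */
static void sketch_quick_i2(struct sketch_quick_state *st)
{
    if (!st->outgoing_done)
        sketch_install_outgoing(st);
}
```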

   For transport-mode with the MAST kernel driver, we actually don't
   eroute ANYTHING. 

   We can find out the previous refhim by looking at the state
   referenced by:
      st->st_connection->newest_ipsec_sa

   If the gateway has already EXPIRED the state, then we have a
   problem. I don't have a solution for this problem yet. I think that
   we will have to have some kind of mapping:
      {public-key, SA-tuple} => refhim

6) for tunnel mode SAs, we can now use iptables. 

   We do:
  iptables -I PREROUTING 1 -j IPSEC -t mangle
  iptables -I IPSEC 1 -s 192.0.1.0/24 -d 192.0.2.0/24 -j MARK --set-mark 0x80120000

   Except that it turns out iptables doesn't grok 0x, and thinks of the
   set-mark value as being signed... so _updown.mast uses /bin/printf to do
   the right thing.

  We then do:
	ip rule add fwmark 0x80000000 fwmarkmask 0x80000000 table 50
	ip route add 0.0.0.0/0 dev $PLUTO_INTERFACE table 50

  fwmarkmask is a new option, and it may not stay in this form: it has
  kernel-side code too, which was posted, but needs changes to be accepted.

7) _updown is now a wrapper that basically does:

	exec @IPSEC_LIBDIR@/_updown.${PLUTO_STACK} $*
  
   where PLUTO_STACK={mast,klips,netkey}

8) for transport mode UDP sends, when the IPSEC_REFINFO option is
   attached using sendmsg(), then basically we fill in the outgoing flow
   information. We figure out which mastX device to force things to by
   looking up the ref value given using ipsec_sa_getbyid(), and finding
   the attached mastX device.

   (Oh, yeah, each SA can now indicate which device it should go in or
   out of, but only mast0 is supported by pluto)
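
   From the sending application's point of view, attaching the ref to
   an outgoing datagram might look roughly like this. The option number
   is a placeholder, the function name is made up, and the payload
   layout (a single unsigned int naming the SA to send on) is an
   assumption rather than something these notes spell out.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Placeholder option number; the real one depends on the patched kernel. */
#define SKETCH_IP_IPSEC_REFINFO 30

/* Prepare msg so that sendmsg() carries one refinfo control message.
 * buf must be suitably aligned for struct cmsghdr.
 * Returns 0 on success, -1 if buf is too small. */
static int sketch_attach_ref(struct msghdr *msg, void *buf, size_t buflen,
                             unsigned int ref)
{
    struct cmsghdr *cmsg;

    if (buflen < CMSG_SPACE(sizeof(ref)))
        return -1;

    memset(buf, 0, buflen);
    msg->msg_control = buf;
    msg->msg_controllen = CMSG_SPACE(sizeof(ref));

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = IPPROTO_IP;
    cmsg->cmsg_type = SKETCH_IP_IPSEC_REFINFO;
    cmsg->cmsg_len = CMSG_LEN(sizeof(ref));
    memcpy(CMSG_DATA(cmsg), &ref, sizeof(ref));  /* assumed payload: one ref */
    return 0;
}
```

   The caller would fill in msg_name and msg_iov as usual, pass the
   refhim it recorded for the peer, and then call sendmsg().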
	
9) tunnel and OE

   we should be able to do the same trick with iptables with a new module:

   iptables -I IPSEC 1 -s 192.0.1.0/24 -d 192.0.2.0/24 -j IPSEC --saref 0x012
  
   just have to write this module. A trap will have to become

   iptables -I IPSEC 1 -s 192.0.1.0/24 -d 192.0.2.0/24 -j IPSECTRAP --???

   and once pluto sees that, it will have to change it to -j IPSEC.
   The challenge is going to be the programmatic interface to iptables...


10) OE	

   iptables -I IPSEC 1 -s 192.0.1.0/24 -d 192.0.2.0/24 -j IPSECOE 

   This is different, because it has to use conntrack to look up the
   particulars of the stream, and if there is no conntrack, then it
   has to generate an ACQUIRE. 

   I may use the existing eroute code. Or not. I'm not sure yet.

IRC notes:

(16:27:39) mcr: no rekeys yet.
(16:27:56) mcr: you want an explanation.... okay.
(16:28:16) mcr: so, the unique value that the l2tpd uses for the remote client is the "refhim" --- i.e. the outgoing SA.
(16:28:17) ***pat braces himself.
(16:28:45) mcr: l2tpd actually doesn't check the incoming SA anymore. This concerns me a bit, but we need to have a way to ask pluto, "is this SA the same as the previous one?"
(16:29:21) mcr: I was originally going to make the kernel keep the refme the same too. I didn't like my method. I've thought of a better method, and I'll be doing that soon.
(16:29:40) mcr: (we'll need it for TCP, and also for NAT-T friendly OE and NAT-T friendly BTNS...)
(16:30:00) mcr: so, the trick is that pluto tries to keep the refhim the same. It's okay to have only one outgoing SA --- because we get to pick which one.
(16:30:10) mcr: But, we need to keep all incoming SAs until the remote machine deletes them.
(16:30:37) mcr: one of the problems is if the gateway expires the IPsec SA soon --- then we have no record of what the SA was (no refhim=) so, we can't maintain it.
(16:30:44) mcr: okay, so fixed that, and then got a different problem.
(16:31:03) mcr: the client would rekey, and we'd try to keep the SA -- but we screwed up in logic.
(16:31:45) mcr: we have to install the outgoing SA before the incoming SA if we don't know what the outgoing SA is going to be (this is a change compared to what we did before. we would still install the "eroute" after we are sure everything is sane).
(16:31:59) mcr: so, we say, "if refhim==NULL" => install outgoing SA.
(16:32:15) mcr: then, when it is time to install outgoing stuff, we do again, "if refhim==NULL" => install outgoing SA.
(16:32:30) mcr: except that during a rekey, refhim is intentionally NOT NULL. (we keep the value).
(16:32:54) mcr: Since we know the value, we can install the outgoing SA whenever we like (we already know what to reference in the incoming SA), but because the test was the same, we never
(16:33:19) mcr: installed the new outgoing SA, so we just kept using the old outgoing SA. Well, during a *REAL* rekey (due to expire of SA), the old SA on the client is gone.
(16:33:25) mcr: (vs when you simulate a rekey with --up!)
(16:33:29) mcr: which is what my tests did.
(16:33:50) paul: ahh. bad testcase :)
(16:41:41) pat: operator error :-)