Intercept request -> show custom page -> show original requested page

Asked by Dajan Zvekic on 2011-02-15

Hi,

I assume you cannot understand much from the summary of the question, so here are the details.
I want to implement this functionality:

1. User requests a page (for example: www.google.hr/ig)
2. Proxy intercepts the request and shows a custom page. (implemented with an eCAP adapter)
3. On this custom page I want the user to be able to click a button and continue with the original browsing, so I assume I will have to reconstruct the whole original HTTP request.
4. After clicking the continue button, the user is forwarded to their iGoogle page with all cookies and everything, as if the interruption never happened.

The question is: can I do this with the adapter alone? If so, how?
If I cannot do it with the adapter, what are the alternatives? (Patching Squid to cache the original request, or something similar?)

Question information

Language: English
Status: Solved
For: eCAP
Assignee: No assignee
Solved by: Dajan Zvekic
Solved: 2011-03-15
Last query: 2011-03-15
Last reply: 2011-02-25

This question was reopened

Alex Rousskov (rousskov) said : #1

This sequence is pretty common during login/registration/splash screens, and I believe it can be implemented using eCAP.

Your adapter-generated initial response would have to encode request ID (or full request details) as hidden form field(s) or form submission URL parameters. When the user submits the form, the adapter receives the request again, looks at the form fields or URL parameters, and rewrites the request to restore the original request headers.

If you use the request ID approach, the adapter would have to store the original request details and then look them up using the request ID extracted from the second request.
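
A minimal sketch of the request ID approach, assuming the adapter keeps its own in-memory table. The names SavedRequest and RequestStore and the choice of saved fields are hypothetical and not part of eCAP; the adapter would fill the record from the virgin request and restore it when the form comes back.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical record of what the adapter needs to replay a request later.
struct SavedRequest {
    std::string method;
    std::string uri;
    std::string cookies;
};

// Hypothetical in-memory store; it must live outside any single transaction.
class RequestStore {
public:
    // Saves a copy of the request details and returns a new request ID.
    std::uint64_t remember(const SavedRequest &req) {
        const std::uint64_t id = ++lastId_;
        saved_[id] = req;
        return id;
    }

    // Copies the saved details into 'out' and forgets them.
    // Returns false if the ID is unknown or was already used.
    bool take(std::uint64_t id, SavedRequest &out) {
        const std::map<std::uint64_t, SavedRequest>::iterator it = saved_.find(id);
        if (it == saved_.end())
            return false;
        out = it->second;
        saved_.erase(it);
        return true;
    }

private:
    std::uint64_t lastId_ = 0;
    std::map<std::uint64_t, SavedRequest> saved_;
};
```

The ID returned by remember() is what would go into the hidden form field or URL parameter. Sequential IDs are easy to guess, so a real deployment would probably want random IDs and some expiry of old entries.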

In many environments, it is a good idea to redirect the user to some registration/splash/login web site under your control and then have that site redirect the user back to their original destination. This way, there is less impersonation of the original site, and more flexibility with regard to splash page generation. This approach requires sharing request details with the site software, but forwarding an encoded original URL is often sufficient. For example, here is a Location header with a base64-encoded original URL as a CGI parameter:

    Location: http://www.my-splash-site.com/?original-url=aHR0cDovL3d3dy5teS1zcGxhc2gtc2l0ZS5jb20vb3JpZ2luYWwtdXJsCg==
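
A sketch of how such a header value could be built. The helper names base64Encode and splashLocation are illustrative only. Percent-encoding the '+', '/' and '=' characters is an extra precaution added here for use inside a query string; the example above leaves the padding as-is.

```cpp
#include <string>

// Standard base64 (RFC 4648) encoding with '=' padding.
std::string base64Encode(const std::string &in) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    std::size_t i = 0;
    // Encode each full 3-byte group as 4 characters.
    for (; i + 2 < in.size(); i += 3) {
        const unsigned n = (static_cast<unsigned char>(in[i]) << 16) |
                           (static_cast<unsigned char>(in[i + 1]) << 8) |
                           static_cast<unsigned char>(in[i + 2]);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += table[n & 63];
    }
    // Encode the 1 or 2 leftover bytes, padding with '='.
    if (i + 1 == in.size()) {
        const unsigned n = static_cast<unsigned char>(in[i]) << 16;
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += "==";
    } else if (i + 2 == in.size()) {
        const unsigned n = (static_cast<unsigned char>(in[i]) << 16) |
                           (static_cast<unsigned char>(in[i + 1]) << 8);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += '=';
    }
    return out;
}

// Builds the redirect header line. The three base64 characters that are
// not safe inside a query string are percent-encoded.
std::string splashLocation(const std::string &splashBase,
                           const std::string &originalUrl) {
    std::string value;
    for (const char c : base64Encode(originalUrl)) {
        if (c == '+')
            value += "%2B";
        else if (c == '/')
            value += "%2F";
        else if (c == '=')
            value += "%3D";
        else
            value += c;
    }
    return "Location: " + splashBase + "?original-url=" + value;
}
```

The splash site would reverse both steps (percent-decoding, then base64-decoding) before redirecting the user back.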

Dajan Zvekic (dajann) said : #2

Thank you for the quick response.
I also had these two approaches in mind. So, to recapitulate:
I can either store the original request under some ID parameter, or forward all the needed information to the splash site the first time (which requires me to send the redirect URL).

Actually, I prefer the second option, so the only thing I need in the adapter is to get the full URL from the request and use it when building the splash site URL.
How do I get the full URL in the adapter? I can easily find the host in the headers, but not the full URL.

Also, if I needed to store the original request, how would I store it? Can I configure Squid to do that, or do I have to do it explicitly in the adapter?

Alex Rousskov (rousskov) said : #3

Once you have the request message, you can access its Request-URI and Host header values via firstLine() and header() methods. Depending on how your host application is deployed, you may need both to reconstruct the full Request-URI. There is a little bit more info about this at Question #76192 even though that Question deals with RESPMOD.
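
A rough sketch of the reconstruction step, working on plain strings. It assumes the adapter has already extracted the Request-URI and the Host header value as text; the function name fullUrl and the default "http" scheme are assumptions for illustration.

```cpp
#include <string>

// Returns an absolute URL for a request.
// If the Request-URI starts with '/', it is only a path, so the scheme and
// the Host header value are prepended. Otherwise the Request-URI is assumed
// to be absolute already and is returned unchanged.
std::string fullUrl(const std::string &requestUri, const std::string &host,
                    const std::string &scheme = "http") {
    if (requestUri.empty() || requestUri[0] != '/')
        return requestUri;
    return scheme + "://" + host + requestUri;
}
```

Which of the two branches is taken depends on the deployment: a proxy that is explicitly configured in the browser usually sees absolute Request-URIs, while an intercepting or reverse setup may see only the path.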

To store the original request, you will need to implement some kind of request caching inside the adapter. Squid (or any other caching proxy I know of) will not store the request for you and eCAP does not have an API to receive that stored request anyway. Keep in mind that all transactions are independent from HTTP/eCAP point of view. The host application will not help you find the connection between the second HTTP request and the original request stored inside your adapter. Your code will need to do all that work.

Caching/storing messages given to the adapter by the host application is probably not a good idea because those messages may have a transaction scope/lifespan. If you store them, they may become invalid once the first transaction is over. It is safer to either clone the virgin request message (that should be safe, bugs notwithstanding) or create a brand new request message and copy the relevant virgin request parameters there.

It is possible (and, in some aspects, better) to just remember/cache the first request URI and use redirects instead of request rewriting to send the user to the splash site and, later, to the intended origin server site. However, that approach requires you to distinguish the first request from the third request: both will have identical request URI but only the first request should be redirected to the splash site. Cookies may be used to make this distinction.
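
One way to sketch the cookie check, assuming the adapter already has the Cookie header value as a string. The cookie name "splash_seen" and both function names are made up for this example.

```cpp
#include <string>

// Returns true if the Cookie header value contains a cookie with this name.
bool hasCookie(const std::string &cookieHeader, const std::string &name) {
    std::size_t pos = 0;
    while (pos < cookieHeader.size()) {
        // Find the end of the current "name=value" pair.
        std::size_t end = cookieHeader.find(';', pos);
        if (end == std::string::npos)
            end = cookieHeader.size();
        // Skip spaces before the cookie name.
        while (pos < end && cookieHeader[pos] == ' ')
            ++pos;
        // The name runs up to '=' or, if there is none, to the pair end.
        const std::size_t eq = cookieHeader.find('=', pos);
        const std::size_t nameEnd =
            (eq == std::string::npos || eq > end) ? end : eq;
        if (cookieHeader.compare(pos, nameEnd - pos, name) == 0)
            return true;
        pos = end + 1;
    }
    return false;
}

// First request (no marker cookie): redirect to the splash site.
// Third request (marker cookie present): let it through to the origin.
bool shouldShowSplash(const std::string &cookieHeader) {
    return !hasCookie(cookieHeader, "splash_seen");
}
```

Cookies are scoped to a domain, so the marker has to be set in a response that the browser associates with the originally requested site.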

No matter what you do, splash screens will break some client transactions. Pick your poison, test well, and deploy gradually.

Dajan Zvekic (dajann) said : #4

"To get to the RequestLine::uri() method, you need to dynamic_cast the generic FirstLine pointer or reference to specific RequestLine pointer or reference. FirstLine class does not have a uri() method because only requests have URIs. Similarly, only ResponseLine can deal with response status codes."

OK, I understand the idea, but when I do something like this:

libecap::RequestLine reqLine = dynamic_cast<libecap::RequestLine &>(hostx->virgin().firstLine());

it says: cannot declare variable ‘reqLine’ to be of abstract type ‘libecap::RequestLine’

Is there a specific class that inherits RequestLine? Or am I doing something wrong?

Alex Rousskov (rousskov) said : #5

You should declare reqLine as a [constant] reference to libecap::RequestLine:

const libecap::RequestLine reqLine = dynamic_cast<const libecap::RequestLine &>(hostx->virgin().firstLine());

The host application has a "specific class that inherits RequestLine" but you do not need it and do not want to know what it is. All you need is a pointer or reference to access the request line data.

Alex Rousskov (rousskov) said : #6

Sorry, forgot to add a reference sign:

const libecap::RequestLine &reqLine = dynamic_cast<const libecap::RequestLine &>(hostx->virgin().firstLine());
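
To illustrate why the reference matters, here is a self-contained sketch with stand-in classes. These are not the real libecap declarations, and the Host... classes are invented to play the role of whatever the host application uses internally.

```cpp
#include <string>
#include <typeinfo>

// Stand-ins for the library's abstract interfaces.
struct FirstLine {
    virtual ~FirstLine() {}
};
struct RequestLine : FirstLine {
    virtual std::string uri() const = 0; // pure virtual: RequestLine is abstract
};
struct ResponseLine : FirstLine {
    virtual int statusCode() const = 0;
};

// Stand-ins for the concrete classes that the host application keeps private.
struct HostRequestLine : RequestLine {
    std::string uri() const override { return "http://www.google.hr/ig"; }
};
struct HostResponseLine : ResponseLine {
    int statusCode() const override { return 200; }
};

// Binds a reference to the existing object; no RequestLine is created or
// copied, so the abstract type is not a problem.
// Throws std::bad_cast if 'line' is not a request line.
std::string uriOf(const FirstLine &line) {
    const RequestLine &reqLine = dynamic_cast<const RequestLine &>(line);
    return reqLine.uri();
}
```

Declaring reqLine without the '&' would ask the compiler to construct a new RequestLine object from the cast result, which is impossible for an abstract class; that is what the "abstract type" error is about.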

Dajan Zvekic (dajann) said : #7

Thank you, your answer solved this problem.

Dajan Zvekic (dajann) said : #8

Thanks Alex Rousskov, that solved my question.

Dajan Zvekic (dajann) said : #9

Hi,

I am now at the second part of the originally described use case.

"After clicking the continue button, the user is forwarded to their iGoogle page with all cookies and everything, as if the interruption never happened."

What I have done so far:
The first time a request reaches the adapter, I set a cookie that later tells me the user has already been through the adapter.

The problem appears when a request reaches the adapter the second time. I can easily use another adapted body, but now I want to forward the user to the actually requested page, and that is where I get an error:

"The following error was encountered while trying to retrieve the URL: http://www.index.hr/
ICAP protocol error.
The system returned: (100003) Unknown error 100003
This means that some aspect of the ICAP communication failed.
Some possible problems are:
    * The ICAP server is not reachable.
    * An Illegal response was received from the ICAP server."

I have code like this:

void Adapter::Xaction::start() {
    Must(hostx);
    if (hostx->virgin().body()) {
        receivingVb = opOn;
        hostx->vbMake(); // vbDiscard()... I don't need a body
    } else {
        // we are not interested in vb if there is not one
        receivingVb = opNever;
    }

    // check if adapter is visited
    if (!visited) {
        // do things
    } else {
        // here I used the adapter passthru code
        libecap::shared_ptr<libecap::Message> adapted = hostx->virgin().clone();
        Must(adapted != 0);
        if (!adapted->body()) {
            sendingAb = opNever; // there is nothing to send
            lastHostCall()->useAdapted(adapted);
        } else {
            hostx->useAdapted(adapted);
        }
    }
}

I did a similar thing in the abMake method, but it is simply not working.
Do you have any idea why?

Alex Rousskov (rousskov) said : #10

You will probably need to study Squid general log (cache.log) with debug_options set to ALL,9 to see why Squid is complaining.

Dajan Zvekic (dajann) said : #11

Hi,

Here is the cache log for this use case:

user requested: www.facebook.com
splash site is shown
user clicks "continue", which should take the user to: www.facebook.com

I uploaded cache.log here: http://www.2shared.com/file/1t5vbWda/cache.html

I also tried to see what is wrong in there but did not understand much. The request from the browser looks fine to me (after the continue button is clicked). I would be thankful if you could pinpoint what is wrong.

Is the idea of what I am doing wrong? Can this be done the way I did it? Maybe I could have two separate adapters, one for the first pass (modifying_adapter) and one for the second pass (passthru_adapter)? But then how do I tell Squid when to use which adapter?

Dajan Zvekic (dajann) said : #12

What I found out is that the problem is with the virgin message of the second request. More precisely, I think the problem is with the virgin body. For some reason the proxy cannot get it in the second request. I am a bit confused, because the request that the browser sends is OK: I tested it without the proxy and got the page without any errors. Note that there is no problem when I use the passthru code in the first request.

Dajan Zvekic (dajann) said : #13

I would really appreciate if you have any idea to share with me about this problem. Thank you in advance.

Alex Rousskov (rousskov) said : #14

This does not look good:

2011/02/22 10:23:18.303| ../../src/base/AsyncJobCalls.h(178) dial: AsyncJob::start threw exception: std::bad_cast
2011/02/22 10:23:18.303| Adaptation::Ecap::XactionRep will stop, reason: exception

This tells us that a standard exception was thrown when somebody tried to cast Foo to Bar and Foo is not Bar. Do you have any casts in your adapter transaction start() method or the code it calls? I think this would be a dynamic_cast to a reference but there may be other cases as well. This could be a Squid bug too, of course.
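
For illustration, a small standalone sketch of the difference between the two dynamic_cast forms. Base, Foo and Bar are placeholder types, not Squid or libecap classes.

```cpp
#include <typeinfo>

struct Base {
    virtual ~Base() {}
};
struct Foo : Base {};
struct Bar : Base {};

// Reference form: throws std::bad_cast when the object is not a Bar.
bool isBarByReference(const Base &b) {
    try {
        const Bar &bar = dynamic_cast<const Bar &>(b);
        (void)bar; // only the success of the cast matters here
        return true;
    } catch (const std::bad_cast &) {
        return false;
    }
}

// Pointer form: yields a null pointer instead of throwing.
bool isBarByPointer(const Base &b) {
    return dynamic_cast<const Bar *>(&b) != nullptr;
}
```

An uncaught std::bad_cast from the reference form inside start() would produce the kind of log line quoted above, so casting to a pointer and checking for null is one way to find out which cast fails.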

If you cannot find the failing cast, load Squid in gdb, put a breakpoint in Adaptation::Ecap::XactionRep::start(), start Squid with -NCd1 command line options (at least), reproduce the problem, and step through that method, including stepping through your transaction's start method if it gets that far. You should hit a spot where a bad cast throws an exception. If the problem happens during the second transaction, you can type "continue" when the first transaction hits the breakpoint and wait for the second hit.

This may not be the primary cause of your problems, but it is a red flag and a good starting point for an investigation.

I also recommend adding debugging to your adapter. Otherwise, the log goes silent when Squid calls your code, with the exception of somewhat confusing splashes of activity when the adapter calls Squid back.

Dajan Zvekic (dajann) said : #15

I finally got to the bottom of this; the problem was pretty stupid.
I needed a kind of global variable in the adapter to remember whether the user had already visited the adapter. I had added that variable to the Xaction class, never realizing that an Xaction object is created and destroyed every time Squid enters and leaves the adapter. So I just made this variable a global in the Adapter namespace.
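
A simplified sketch of that fix, using a stand-in Xaction class rather than the real adapter code. Tracking visits per client ID (for example an IP address or a cookie value) is an assumption added here; the fix described above used a single variable.

```cpp
#include <set>
#include <string>

namespace Adapter {

// State that outlives individual transactions. Here it is a namespace-level
// variable, as in the fix described above.
std::set<std::string> visitedClients;

// Stand-in for a per-transaction object: created for one message and
// destroyed afterwards, so its own members cannot carry state across requests.
class Xaction {
public:
    explicit Xaction(const std::string &clientId) : clientId_(clientId) {}

    // Returns true if an earlier transaction already saw this client,
    // and records the visit either way.
    bool start() {
        const bool seenBefore = visitedClients.count(clientId_) > 0;
        visitedClients.insert(clientId_);
        return seenBefore;
    }

private:
    std::string clientId_;
};

} // namespace Adapter
```

If the host application ever runs transactions on several threads, access to the shared set would need locking.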