adapter_modifying does not modify bing.com

Asked by Phoenix Wang on 2016-10-19

I'm using Squid as a proxy and have loaded the eCAP adapter in its configuration. I left everything at its defaults in squid.conf and added these lines for eCAP:
loadable_modules /usr/local/lib/ecap_adapter_modifying.so
ecap_enable on
ecap_service ecapModifier respmod_precache \
        uri=ecap://e-cap.org/ecap/services/sample/modifying \
        victim=</body> \
        replacement=aaaaaaaaaaaaaaaaaaaaaaaaaa</body>
adaptation_access ecapModifier allow all

However, when I open www.bing.com, I don't see the replacement working. Does anyone know why?
(P.S. I'm sure Squid itself is working.)

Question information

Language:
English
Status:
Answered
For:
eCAP
Assignee:
No assignee
Last query:
2016-10-25
Last reply:
2016-10-28
Alex Rousskov (rousskov) said : #1

Please disable Squid disk and memory caches. Use wget, curl, or a similar single-request command-line tool. Do you see the expected adaptation?

Phoenix Wang (phx350z) said : #2

Caching is already disabled in my squid.conf:
coredump_dir deny all
cache deny all
but I still don't see the adapter working on bing.com.
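For reference, a minimal squid.conf fragment that disables both caches might look like this (a sketch, assuming Squid 3.x; omitting cache_dir leaves the disk cache off, and cache_mem sizes the memory cache):

cache deny all     # never cache any response
cache_mem 0 MB     # shrink the in-memory cache to nothing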

Alex Rousskov (rousskov) said : #3

What tool do you use to access bing.com during this test?

Phoenix Wang (phx350z) said : #4

I'm using a Chrome extension named SwitchyOmega.

Alex Rousskov (rousskov) said : #5

For initial tests, please use wget, curl, or a similar single-request command-line tool. Do you see the expected adaptation?

Gerald Song (307419612-v) said : #6

Hi Alex,
We've tried curl and wget:
[my-centos ~]# curl -x http://10.xxx.xxx.xxx:3128 -L http://www.bing.com
[my-centos ~]# wget http://www.bing.com -e use_proxy=yes -e http_proxy=10.xxx.xxx.xxx:3128
[my-centos ~]# ...
and in both cases </body> is modified as expected.

However, if we try with Python 2.7:
>>> import requests
>>> requests.get('http://www.bing.com', proxies={'http': '10.xxx.xxx.xxx:3128'}).text
...
</body> is NOT modified as expected.
We've checked /var/logs/cache.log, and the request definitely passed through the proxy.

Any ideas why this is happening?

Alex Rousskov (rousskov) said : #7

> Any ideas why this is happening?

Compare the actual HTTP response headers received by Squid in the wget and python tests. Perhaps the response for the python client was gzip-encoded because that client claims to accept that encoding? If you cannot figure it out, post the complete packet capture (from both sides of Squid, in both directions).
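One quick way to test the gzip theory from the client side is to ask the origin server for an uncompressed body. A sketch using Python 3's standard-library urllib instead of requests (the proxy address is a placeholder, not the poster's real address):

```python
import urllib.request

# Placeholder proxy address; substitute your Squid host and port.
proxy = urllib.request.ProxyHandler({'http': 'http://proxy.example:3128'})
opener = urllib.request.build_opener(proxy)

# "identity" asks the server not to compress the body, so the eCAP
# adapter would see plain HTML containing the literal "</body>" text.
req = urllib.request.Request(
    'http://www.bing.com',
    headers={'Accept-Encoding': 'identity'},
)
print(req.headers)  # the extra header urllib will send

# Uncomment to actually fetch through the proxy:
# body = opener.open(req).read().decode('utf-8', 'replace')
```

If the replacement appears with this header but not with the requests defaults, compression is confirmed as the cause.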

Gerald Song (307419612-v) said : #8

I've uploaded the pcap and the final HTML file to https://drive.google.com/open?id=0B9Z5D1oCUMY5LUI0SDQxdVdQT00

Alex Rousskov (rousskov) said : #9

Yes, it is gzip encoding, as I suspected:

Python (response is gzip-encoded):

GET / HTTP/1.1
Accept-Encoding: gzip, deflate
...

HTTP/1.1 200 OK
Content-Length: 44337
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
...

Curl (plain response without a Content-Encoding header):

GET / HTTP/1.1
Accept: */*
...

HTTP/1.1 200 OK
Content-Length: 120242
Content-Type: text/html; charset=utf-8
...

If you configure Curl to send "Accept-Encoding: gzip, deflate" instead of "Accept: */*", then, I bet, you will get the same compressed response. Compressed responses are very common in real traffic.

The sample modifying adapter does not support compressed content. Production adapters do (or should). For this and other HTML injection caveats please see https://answers.launchpad.net/ecap/+faq/1793
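To see why a byte-matching adapter misses its victim string in compressed traffic, note that a gzip-encoded body no longer contains the literal text. A standalone Python 3 illustration (not part of the adapter itself):

```python
import gzip

html = b'<html><body>hello</body></html>'
compressed = gzip.compress(html)

# The victim string is visible in the plain body...
print(b'</body>' in html)        # → True
# ...but not in the gzip-encoded bytes the adapter actually scans.
print(b'</body>' in compressed)  # → False
# Decompressing restores the searchable text.
print(b'</body>' in gzip.decompress(compressed))  # → True
```

A production adapter therefore has to decode the Content-Encoding before searching (and re-encode, or strip the header, afterwards).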
