DNS resolving issues with/without local domain

Asked by Olaf Kock

We are operating a network consisting of several Ubuntu 8.04 LTS servers and mixed clients (Mac, Win, Linux, Kubuntu 8.04 among them).

Recently I set up a DNS nameserver (bind9 from the Ubuntu repositories). It works almost perfectly, except for lookups from a Kubuntu 8.04 desktop and from some Ubuntu 8.04 servers.

From my Kubuntu 8.04 desktop, as well as from another Ubuntu 8.04 server, I have weird problems resolving /some/ addresses, depending on whether I give the domain part or not. The result also depends on the tool: dig and nslookup work, while ping, Firefox etc. do not (note: it's only the name resolution that fails).

The result is as follows (note that ab.local is the local domain, served by this dns server, as well as the domain set as default in /etc/resolv.conf):
  dig a1
     (works, address is resolving)
  dig a1.ab.local
     (works, address is resolving)
  nslookup a1
     (works, address is resolving)
  nslookup a1.ab.local
     (works, address is resolving)
  ping a1
     (works, address is resolving)
  ping a1.ab.local
     (doesn't work - "ping: unknown host a1.ab.local")

Note: Mac OS has no problems resolving either way, and neither does Windows. I know for certain that my problem was solved at some point, because it disappeared some time ago, but it reappeared today.

My /etc/resolv.conf contains just this one (not yet replicated) nameserver:
  search ab.local
  nameserver 10.2.0.2

I've also added a line "domain ab.local", but this frequently gets overwritten by DHCP. Even before it is overwritten, though, it doesn't make the lookup work.

I'm puzzled where else to look or what search terms to use.

----------
Information overkill: I believe that everything one needs to know is above this line. If it makes a difference (I doubt it, but just in case), here's more information about the real network setup.

There are 4 networks: 10.1.0.x, 10.2.0.x, 10.3.0.x and 192.168.1.x. The nameserver resides on 10.2.0.2 and is distributed through DHCP to the rest of the network. Forward resolution puts all local networks in the same domain. Our local domain is named "ab.local"; the hosts are, depending on the network, canonically named a1.ab.local through a255.ab.local, b1.ab.local etc. (additionally there are some CNAME records to name servers by function, but I guess this doesn't contribute to the problem). Reverse lookups for each of these networks resolve just to the numerical names, not the CNAMEs. There's one file containing forward lookups and 4 files containing reverse lookups.

Let me know if I need to provide more information or the bind config files. I have full access to all these machines.

Thanks
Olaf

Question information

Language: English
Status: Solved
For: Ubuntu
Assignee: No assignee
Solved by: Andy Ruddock
Best Andy Ruddock (andy-ruddock) said :
#1

If you run

sudo ltrace -o ping.ltrace -f ping -c 4 a1.ab.local

Then you'll get a file "ping.ltrace" which contains the library calls made; that may give a clue.
You can also do the same with strace to get the system calls made.
Another option is to use wireshark to capture the network traffic; you may get some clue from there.
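
The strace suggestion could look roughly like the sketch below. It assumes strace is installed and that tracing ping needs root (hence sudo). The function names trace_lookup and resolver_files and the log name ping.strace are made up for illustration. The "-e trace=file" filter limits the log to system calls that take a file name, which is usually enough to see which resolver configuration files are read.

```shell
#!/bin/sh
# trace_lookup HOST LOG: record the file-related system calls ping makes
# while resolving HOST (hypothetical helper; requires strace and sudo).
trace_lookup() {
    sudo strace -f -e trace=file -o "$2" ping -c 1 "$1"
}

# resolver_files LOG: list the resolver-related config files named in an
# strace log, one per line, without duplicates (hypothetical helper).
resolver_files() {
    grep -oE '/etc/(nsswitch\.conf|resolv\.conf|host\.conf|hosts)' "$1" | sort -u
}

# Usage:
#   trace_lookup a1.ab.local ping.strace
#   resolver_files ping.strace
```

Any file that shows up in the list but has not been checked yet is a good place to look next.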

Olaf Kock (okock) said :
#2

Thanks Andy Ruddock, that solved my question.

Olaf Kock (okock) said :
#3

Hi Andy,

praise praise. Thanks a lot. strace did the trick and showed that /etc/nsswitch.conf was involved, containing the line

    hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4

After placing "dns" before [NOTFOUND=return], everything works as expected. What remains is to read the docs to understand what I've done and whether I want /this/ solution or a different one. At least the problem has been nailed down.
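
A likely explanation, offered as a sketch rather than something verified on this setup: mdns4_minimal only handles names ending in .local, and when multicast DNS cannot resolve such a name it reports NOTFOUND, so [NOTFOUND=return] ends the lookup before dns is ever asked. ping and Firefox go through this NSS chain, while dig and nslookup query the nameserver directly, which would account for the difference. A bare name such as a1 does not end in .local, so mdns4_minimal probably declines it and the lookup continues to dns, where the search domain from /etc/resolv.conf is appended. One ordering that matches the description (an assumption, since the edited line itself isn't shown) is

    hosts: files dns mdns4_minimal [NOTFOUND=return] mdns4

The small check below flags a hosts line in which [NOTFOUND=return] comes before dns. The function name dns_before_return is hypothetical, and the check is only a heuristic, not a full nsswitch.conf parser.

```shell
#!/bin/sh
# dns_before_return FILE: succeed if "dns" appears on the hosts line before
# any [NOTFOUND=return] action, fail if the action comes first.
# Hypothetical helper for illustration; it ignores other action types.
dns_before_return() {
    awk '$1 == "hosts:" {
        for (i = 2; i <= NF; i++) {
            if ($i == "dns") {
                print "dns is consulted before [NOTFOUND=return]"
                exit 0
            }
            if (tolower($i) == "[notfound=return]") {
                print "[NOTFOUND=return] stops the lookup before dns"
                exit 1
            }
        }
    }' "$1"
}

# Usage:
#   dns_before_return /etc/nsswitch.conf
#   getent hosts a1.ab.local   # resolves through the same NSS chain as ping
```

After editing /etc/nsswitch.conf, "getent hosts a1.ab.local" makes a quick test, because it resolves through NSS just as ping does.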