MaraDNS troubleshooting guide This troubleshooting guide is an example of how problems with MaraDNS may be resolved without needing to wait for support on the MaraDNS mailing list. This guide is a troubleshooting example that uses MaraDNS on a CentOS 3.8 system. People using MaraDNS on other systems will have to adapt this guide to suit their needs. The problem we will troubleshoot in this example is MaraDNS not responding to DNS queries. As we will see in this guide, a number of different issues can cause this problem, and resolving the problem depends on what issue is causing the problem. As just some of the possible issues, it is possible that the MaraDNS process is not running at all. It's possible that MaraDNS is running, but can't bind to the assigned IP (because of a Linux bug, MaraDNS can not accurately report this problem when run in Linux). Here are some hints: * Use the askmara client, not dig, not host, not nslookup, and not djbdns' DNS lookup thingy to perform DNS queries. Why? Because this document shows you what askmara's replies are when sending DNS queries. * Keep the mararc short and simple while troubleshooting: ipv4_bind_addresses = "127.0.0.1" chroot_dir = "/etc/maradns" recursive_acl = "127.0.0.1/8" In the above mararc file, MaraDNS has the IP 127.0.0.1, would look for zone files in the directory /etc/maradns, and allows recursive DNS queries on the loopback interface. OK, so let's look at some problems, as they appear on a CentOS 3.8 box with the above mararc file. This is how things look when we don't have a loopback interface to bind to. Like in all examples in this guide, the '$' character indicates a line that we type data on; all other lines, including lines that start with '#', are lines created by the programs we are running in these examples. $ askmara Awww.google.com. # Querying the server with the IP 127.0.0.1 # Hard Error: Unable to send UDP packet! Basically, the askmara client is unable to send a query because there is no way for it to contact a server on 127.0.0.1. Probably because there is no 127.0.0.1 to send the packet on. So, let's start troubleshooting. $ export PATH=$PATH:/sbin:/usr/sbin:/usr/local/sbin This gives us access to commands like ifconfig and what not. $ su Password: type in your root password here $ ifconfig lo 127.0.0.1 $ askmara Awww.google.com. # Querying the server with the IP 127.0.0.1 # Hard Error: Timeout OK, so let's restart MaraDNS: $ /etc/rc.d/init.d/maradns restart Sending all MaraDNS processes the TERM signal waiting 1 second Sending all MaraDNS processes the KILL signal MaraDNS should have been stopped Starting all maradns processes Starting maradns process which uses Mararc file /etc/mararc If /etc/rc.d/init.d/maradns restart doesn't generate the above output, this indicates that either MaraDNS was not correctly installed, or that you are using MaraDNS on another Linux/*NIX distribution. If you're not using CentOS or Red Hat Enterprise Linux, replace this command with the appropriate command for restarting a daemon/service for your operating system. Now, lets look at some possible replies. Server failure $ askmara Awww.google.com. # Querying the server with the IP 127.0.0.1 # Remote server said: SERVER FAILURE # Question: Awww.google.com. # NS replies: # AR replies: This is the askmara output when MaraDNS is running correctly but is unable to connect to DNS servers on the internet. This can be caused when the machine running MaraDNS does not have an internet connection, or when MaraDNS is being firewalled. So, we get the internet connection up and going. If you have a working ethernet card and are on a network with internet access, this is as simple as making a DHCP request for an IP: $ dhclient Internet Systems Consortium DHCP Client V3.0.1 Copyright 2004 Internet Systems Consortium. All rights reserved. For info, please visit http://www.isc.org/products/DHCP /sbin/dhclient-script: configuration for eth0 not found. Continuing with defaults. /sbin/dhclient-script: line 52: eth0: No existe el fichero o el directorio Listening on LPF/eth0/00:40:f4:17:ac:e9 Sending on LPF/eth0/00:40:f4:17:ac:e9 Listening on LPF/lo/ Sending on LPF/lo/ Sending on Socket/fallback DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 6 DHCPOFFER from 10.1.2.1 DHCPREQUEST on eth0 to 255.255.255.255 port 67 DHCPACK from 10.1.2.1 /sbin/dhclient-script: configuration for eth0 not found. Continuing with defaults. /sbin/dhclient-script: line 52: eth0: No existe el fichero o el directorio bound to 10.1.2.3 -- renewal in 255 seconds. Note that if you are using something besides CentOS or Red Hat Enterprise Linux, the command for getting a DHCP lease may not be dhclient. Now, the dhclient that CentOS 3.8 comes with is buggy, and breaks lo (the loopback interface which gives CentOS the 127.0.0.1 IP address). So, we have to fix lo again: $ ifconfig lo 127.0.0.1 In addition, losing 127.0.0.1 breaks any service bound to 127.0.0.1, such as MaraDNS, so we have to rebind MaraDNS to 127.0.0.1: $ /etc/rc.d/init.d/maradns restart Sending all MaraDNS processes the TERM signal waiting 1 second Sending all MaraDNS processes the KILL signal MaraDNS should have been stopped Starting all maradns processes Starting maradns process which uses Mararc file /etc/mararc Keep in mind that MaraDNS binds to high-numbered ports when sending outgoing DNS requests. The "Firewall Configuration" section of the MaraDNS man page gives details. The problem with UNIX firewalls is that there is no standard interface for configuring them, so I can't help you as well as I would like here. CentOS 3.8, by default, has a firewall that allows MaraDNS to act as a recursive nameserver on the loopback (127.0.0.1) interface, but the firewall needs to be changed to work on other interfaces: $ redhat-config-securitylevel-tui And select "personalize", and add "53:udp" as a hole in the firewall. Yes, the interface for this program is somewhat primitive; hopefully CentOS 4 has a more complete interface. You will have to do a similiar configuration change to any firewalls between your server and the internet. Timeout It is also possible to get a timeout after sending an askmara query. A timeout looks like this: $ askmara Awww.google.com. # Querying the server with the IP 127.0.0.1 At this point, there is a 30 second delay. After the delay, askmara outputs this message: # Hard Error: Timeout This is usually caused by one of two problems: * Maradns is not correctly running * A firewall is stopping DNS packets from being sent over the loopback interface To see if MaraDNS is running, run ps like this: $ ps auxw | grep maradns If MaraDNS is running, the output will look like this: root 2023 0.0 0.0 1516 304 pts/1 S 11:46 0:00 /usr/bin/duende /usr/sbin/maradns -f /etc/mararc nobody 2024 0.3 0.1 1748 596 pts/1 S 11:46 0:00 /usr/sbin/maradns -f /etc/mararc #66 2025 0.0 0.0 1520 440 pts/1 S 11:46 0:00 /usr/bin/duende /usr/sbin/maradns -f /etc/mararc user 2027 0.0 0.1 3720 700 pts/1 S 11:46 0:00 grep maradns If MaraDNS is not running, the output will look like this: user 1983 0.0 0.1 3728 696 pts/1 S 11:45 0:00 grep maradns If MaraDNS is not running, there may be a message in the log files indicating why MaraDNS failed when you tried to start MaraDNS. Look at the log: $ su Password: $ grep maradns /var/log/messages | more The messages will give you a hint as to what is preventing MaraDNS from starting up. If there are no MaraDNS messages in your log, there is something wrong with your MaraDNS installation. Conclusion Basically, the best strategy for troubleshooting problems with MarNS is to have the mararc file be a simple three line mararc file. If things still don't work, the problem is probably outside of MaraDNS.