Universidade Federal do Rio de Janeiro COPPE
Instituto Alberto Luiz Coimbra de PósGraduação e Pesquisa de Engenharia
Instituto de Matemática




Data: Thursday, July 19, 2007 At 14:00
Duração: 2 Horas


Palestras: Esther Pacitti e Patrick Valduriez
Projeto de cooperacao DAAD, financiado pela Capes/Cofecub. Data: 19 de julho 2007 (5a.feira) Hora: 14:00h e 15:00h Local: Sala H318b (COPPESistemas) Centro de Tecnologia, bloco H, 3ro andar. Data Currency in Replicated DHTs* Esther Pacitti INRIA and LINA, France Distributed Hash Tables (DHTs) provide a scalable solution for data sharing in P2P systems. To ensure high data availability, DHTs typically rely on data replication, yet without data currency guarantees. Supporting data currency in replicated DHTs is difficult as it requires the ability to return a current replica despite peers leaving the network or concurrent updates. In this paper, we give a complete solution to this problem. We propose an Update Management Service (UMS) to deal with data availability and efficient retrieval of current replicas based on timestamping. For generating timestamps, we propose a Keybased Timestamping Service (KTS) which performs distributed timestamp generation using local counters. Through probabilistic analysis, we compute the expected number of replicas which UMS must retrieve for finding a current replica. Except for the cases where the availability of current replicas is very low, the expected number of retrieved replicas is typically small, /e.g./ if at least 35% of current replicas are available then the expected number of retrieved replicas is less than 3. We validated our solution through implementation and experimentation over a 64node cluster and evaluated its scalability through simulation up to 10,000 peers using SimJava. The results show the effectiveness of our solution. They also show that our algorithm used in UMS achieves major performance gains, in terms of response time and communication cost, compared with a baseline algorithm. *Joint work with Reza Akbarinia and Patrick Valduriez, SIGMOD2007.  Best Position Algorithms for Topk Queries** Patrick Valduriez INRIA and LINA, France The general problem of answering topk queries can be modeled using lists of data items sorted by their local scores. The most efficient algorithm proposed so far for answering topk queries over sorted lists is the Threshold Algorithm (TA). However, TA may still incur a lot of useless accesses to the lists. In this paper, we propose two new algorithms which stop much sooner. First, we propose the best position algorithm (BPA) which executes topk queries more efficiently than TA. For any database instance (/i.e./ set of sorted lists), we prove that BPA stops as early as TA, and that its execution cost is never higher than TA. We show that the position at which BPA stops can be /(m1)/ times lower than that of TA, where /m/ is the number of lists. We also show that the execution cost of our algorithm can be /(m1)/ times lower than that of TA. Second, we propose the BPA2 algorithm which is much more efficient than BPA. We show that the number of accesses to the lists done by BPA2 can be about /(m1)/ times lower than that of BPA. Our performance evaluation shows that over our test databases, BPA and BPA2 outperform TA by a factor of about /(m+6)/8/ and /(m+1)/2/ respectively, /e.g./ for /m/=10, the factor is about 2 and 5.5, respectively. ** Joint work with Reza Akbarinia and Esther Pacitti, VLDB2007 (to appear). 


