Leveraging Transverse Reads to Correct Alignment Faults in Domain Wall Memories Proc. of the International Conference on Dependable Systems and Networks (DSN), Portland, OR (June 2019) S. Ollivier, D. Kline, R. Kawsher, R. Melhem, S. Bhanja and A. Jones | [google] |
Optimal Placement of In-memory Checkpoints under Heterogeneous Failure Likelihood Proc. of the Int. Conf. on Parallel and Distributed Processing (IPDPS), Rio de Janeiro, Brazil (May 2019) Z. Hussain, T. Znati and R. Melhem | [google] |
CoLoR: Co-Located Rescuers for Fault Tolerance in HPC Systems Proc. of the International Conference on Parallel and Distributed Systems (ICPADS), Sentosa, Singapore (December 2018) Z. Hussain, X. Cui, T. Znati and R. Melhem | [google] |
Improving Sustainability Through Disturbance Crosstalk Mitigation in Deeply Scaled Phase-change Memory Proc. of the International Green and Sustainable Computing Conference (IGSC), Pittsburgh, PA. (October 2018) S. Seyedzadeh, A. Jones and R. Melhem | [google] |
Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities Proc. of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC’18), Dallas, TX (November 2018). Z. Hussain, T. Znati and R. Melhem | [google] |
Mitigating Word-line Crosstalk using Adaptive Trees of Counters Proc. of the International Symposium on Computer Architecture (ISCA), Los Angeles, CA (June 2018). S. Seyedzadeh, A. Jones and R. Melhem | [google] |
A Systematic Fault-tolerant Computational Model for Both Crash Failures and Silent Data Corruption Proc. of the 21st Conference on Innovation in Clouds, Internet and Networks (ICIN), Paris, France (February 2018). X. Cui, Z. Hussain, T. Znati and R. Melhem | [google] |
Rejuvenating Shadows: Fault Tolerance with Forward Recovery Proc. of the Int. Conf. on High Performance Computing and Communications (HPCC), Bangkok, Thailand, (December 2017). X. Cui, T. Znati and R. Melhem | [google] |
Yoda: Judge me by my size, do you? Proc. of the International Conference on Computer Design (ICCD), Boston, MA (November 2017). J. Zhang, D. Kline Jr., L. Fang, R. Melhem and A. Jones | [google] |
Dynamic Partitioning to Mitigate Stuck-at Faults in Emerging Memories Proc. of the International Conference on Computer Aided Design (ICCAD), Irvine, CA. (November 2017). J. Zhang, D. Kline Jr., L. Fang, R. Melhem and A. Jones | [google] |
Holistic Energy Efficient Crosstalk Mitigation in DRAM Proc. of the International Green and Sustainable Computing Conference (IGSC), Orlando, FL. (October 2017). D. Kline, R. Melhem and A. Jones | [google] |
Sustainable Fault Management and Error Correction for Next-Generation Main Memories Proc. of the International Green and Sustainable Computing Conference (IGSC), Orlando, FL. (October 2017). D. Kline, R. Melhem and A. Jones | [google] |
Mitigating Bitline Crosstalk noise in DRAM Memories Proc. of the International Symposium on Memory Systems (MEMSYS), Washington, DC (October 2017). S. Seyedzadeh, D. Kline, A. Jones and R. Melhem | [google] |
Harvesting Underutilized Resources to Improve Responsiveness and Tolerance to Crash and Silent Faults for Data-intensive Applications Proc. of the International Conference on Cloud Computing (IEEE CLOUD), Honolulu, HI, (June 2017). D. Ganguly, M. Mofrad, T. Znati, R. Melhem and J. Lange | [google] |
Adaptive and Power-Aware Resilience for Extreme-scale Computing Proc. of the 16th IEEE International Conference on Scalable Computing and Communications (ScalCom), Toulouse, France (July 2016). X. Cui, T. Znati and R. Melhem | [google] |
Leveraging ECC to Mitigate Read Disturbance, False Reads and Write Faults in STT-RAM Proc. of the International Conference on Dependable Systems and Networks (DSN), Toulouse, France, (June 2016). S. Seyedzadeh, R. Maddah, A. Jones and R. Melhem | [google] |
Energy Consumption of Resilience Mechanisms in Large Scale Systems Proc. of the 22nd Euromicro Int. Conference on Parallel, Distributed, and Network-Based Processing (PDP), Turin, Italy (February 2014). B. Mills, T. Znati, R. Melhem, K. Ferreira and R. Grant | [google] |
Profit Maximization for Resilient Cloud Computing Proc. of the International Conference on Cloud Computing and Services Science (CLOSER), Barcelona, Spain (April 2014). X. Cui, B. Mills, T. Znati and R. Melhem | [google] |
Shadow Computing: An Energy-Aware Fault Tolerant Computing Model Proc. of the International Conference on Computing, Networking and Communications (ICNC), Honolulu, HI (February 2014). B. Mills, T. Znati and R. Melhem | [google] |
Energy-aware Checkpointing of Divisible Tasks with Soft and Hard Deadlines Proc. of the fourth International Green Computing Conference (IGCC), Arlington, VA (June 2013). G. Aupy, A. Benoit, R. Melhem, P. Renaud-Goud and Y. Robert | [google] |
Power of One Bit: Increasing Error Correction Capability with Data Inversion Proc. of the Pacific Rim International Symposium on Dependable Computing (PRDC), Vancouver, Canada (December 2013). R. Maddah, S. Cho and R. Melhem | [google] |
RDIS: A Recursively Defined Invertible Set Scheme to Tolerate Multiple Stuck-At Faults in Resistive Memory Proc. of the 42nd IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Boston, MA (June 2012). R. Melhem, R. Maddah and S. Cho | [google] |
Process Variation Tolerant Design for Nanophotonic Networks Proc. of The International Symposium on Computer Architecture (ISCA), Portland, OR (June 2012). Y. Xu, J. Yang and R. Melhem | [google] |
Considering Link Qualities in Fault Tolerant Aggregation in Wireless Sensor Networks Proc. of the IEEE Global Telecommunications Conference (Globecom’09), Honolulu, HI (December 2009) S. Gobriel, S. Khattab, D. Mosse and R. Melhem | [google] |
RideSharing: Fault Tolerant Aggregation in Sensor Networks Using Corrective Actions Proc. of the Third Annual IEEE Communications Society Conference on Sensor, Mesh, and Ad Hoc Communications and Networks (SECON), Reston, VA (September 2006) S. Gobriel, S. Khattab, D. Mosse, J. Brustoloni and R. Melhem | [ps/pdf] |
The Effects of Energy Management on Reliability in Real-time Embedded Systems Proc. of the International Conference on Computer Aided Design (ICCAD), San Jose, CA (Nov. 2004) D. Zhu, R. Melhem, and D. Mosse | [ps/pdf] |
Energy-Efficient Duplex and TMR Real-Time Systems Proc. of the Real-time System Symposium RTSS, Austin, TX (Dec. 2002) E. Elnozahy, R. Melhem and D. Mosse | [ps/pdf] |
Power Aware Scheduling for AND/OR Graphs in Multi-Processor Real-Time Systems Proc. of the International Conference on Parallel Processing (ICPP), Vancouver, B.C. (Aug. 2002) D. Zhu, N. AbouGhazaleh, D. Mosse and R. Melhem | [ps/pdf] |
Determining Optimal Processor Speeds for Periodic Real-Time Tasks with Different Power Characteristics Proc. of the 13th Euromicro Conference on Real-time Systems, Delft, The Netherlands (June 2001) H. Aydin, R. Melhem, D. Mosse and P. Mejia-Alvarez | [ps/pdf] |
Optimal Scheduling of Imprecise Computation Tasks in the Presence of Multiple Faults Proc. of the Real-Time Computing Systems and Applications Symp., Cheju, Korea, (Dec. 2000) H. Aydin, D. Mosse, and R. Melhem | [ps/pdf] |
Tolerating Faults while Maximizing Reward Proc. of the 12th Euromicro Conference on Real-time Systems, Stockholm, Sweden (June 2000) H. Aydin, R. Melhem and D. Mosse | [ps/pdf] |
Scheduling Optional Computations in Fault-Tolerant Real-Time Systems Proc. of the Real-Time Computing Systems and Applications Symp., Cheju, Korea, (Dec. 2000) P. Mejia Alvarez, H. Aydin, D. Mosse, and R. Melhem | [ps/pdf] |
Reducing Message Overhead in TMR Systems Proc. of the IEEE International Conference on Distributed Computing Systems (ICDCS '99), Dallas, TX (June 1999) J. Ramirez and R. Melhem | [ps/pdf] |
Implementation of a Transient Fault-tolerance Scheme on DEOS Proc. of The Real-time Technology and Application Symposium, RTAS, Vancouver, Canada (June 1999) L. Dong, R. Melhem, S. Ghosh, W. Heimerdinger and A. Larson | [ps/pdf] |
Global Fault Tolerant Real-Time Scheduling on Multiprocessors Proc. of the 11th IEEE Euromicro Real-Time Workshop, York, UK (June 1999) F. Liberato, S. Lauzac, R. Melhem and D. Mosse | [ps/pdf] |
Incorporating Error Recovery into the Imprecise Computation Model Proc. of the International Conference on Real-Time Computing Systems and Applications, RTCSA '99, Hong Kong (Dec. 1999) H. Aydin, R. Melhem and D. Mosse | [ps/pdf] |
Fault Tolerant, Rate Monotonic Scheduling IFIP International Conference on Dependable Computing for Critical Applications (DCCA), Garmisch, Germany (March 1997) S. Ghosh, R. Melhem and D. Mosse | [google] |
Enhancing Real-Time Schedules to Tolerate Transient Faults Proc. of the 16th IEEE Real-Time Systems Symposium, Pisa, Italy, (1995) S. Ghosh, R. Melhem and D. Mosse | [ps/pdf] |
Fault-Tolerant Scheduling on Hard Real-Time Multiprocessor Systems Proc. of the 8th Int. Parallel Processing Symposium, Cancun, Mexico (1994) S. Ghosh, R. Melhem and D. Mosse | [ps/pdf] |
Compiler Assisted Fault Detection for Distributed Memory Systems Proc. of the 1994 Scalable High Performance Computing Conference, Knoxville, TN (1994) C. Gong, R. Melhem and R. Gupta | [ps/pdf] |
Analysis of a Fault-Tolerant Multiprocessor Scheduling Algorithm Proc. of the 24th Fault-Tolerant Computing Symposium, Austin, TX (1994) D. Mosse, R. Melhem and S. Ghosh | [ps/pdf] |
Replicating Statement Execution for Fault Detection on Distributed Memory Multiprocessors Proc. of the 1994 IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, College Station, TX (1994) C. Gong, R. Melhem and R. Gupta | [ps/pdf] |
Reconfiguration in Fault Tolerant 3D Meshes Workshop on Defect and Faults Tolerance in VLSI Systems, Montreal, Canada (1994) A. Chandra and R. Melhem | [ps/pdf] |
Efficient Bi-level Reconfiguration Algorithms for Fault Tolerant Arrays IEEE Int. Workshop on Defect and Faults Tolerance in VLSI Systems, Dallas, TX. (1992) R. Libeskind-Hadas, N. Shrivastava, R. Melhem and C. L. Liu | [ps/pdf] |
Routing in Modular Fault Tolerant Multiprocessor Systems Proc. of the 22nd International IEEE Symposium on Fault Tolerant Computing, Boston, MA (1992) M. Alam and R. Melhem | [ps/pdf] |
Reconfiguration of Computational Arrays with Multiple Redundancy Proc. of the International Conference on Parallel Processing, St. Charles, Illinois (1991) R. Melhem and John Ramirez | [ps/pdf] |
Embedding Rings in Hypercubes for Run-time Fault Tolerance Proc. of the Fourth ISMM Conference on Parallel and Distributed Computing and Systems, Washington D.C. (1991) F. Provost and R. Melhem | [google] |
Efficient and Optimal Fault-to-Spare Assignment in Doubly Fault Tolerant Arrays Proc. of the IEEE Int. Workshop on Defect and Faults Tolerance in VLSI Systems, Hidden Valley, PA. (1991) N. Shrivastava and R. Melhem | [ps/pdf] |
Meshes with Flexible Redundancy Proc. of the Second Workshop on Algorithms and Parallel VLSI Architectures, Bonas, France, (1991) R. Melhem and J. Ramirez | [google] |
Channel Multiplexing in Modular Fault Tolerant Multiprocessors Proc. of the International Conference on Parallel Processing, St. Charles, Illinois (1991) M. Alam and R. Melhem | [ps/pdf] |
How to use an Incomplete Hypercube for Fault Tolerance Proc. of the first European Workshop on Hypercube and Distributed Computers, Rennes, France (1989) M. Alam and R. Melhem | [google] |
Fault Tolerance and Reliable Routing in Augmented Hypercube Architectures Proc. of the 8th IEEE Phoenix Conference on Computers and Communications, Phoenix, AZ (1989) M. Alam and R. Melhem | [ps/pdf] |
Bi-Level Reconfigurations of Fault Tolerant Arrays in Bi-modal Computational Environments Proc. of the 19th International IEEE Symposium on Fault Tolerant Computing, Chicago, IL (1989) R. Melhem | [ps/pdf] |
Fault Tolerant Embedding of Binary Trees and Rings into Hypercubes Proc. of the International Workshop on Defect and Fault Tolerance in VLSI Systems, Springfield, MA (1988) F. Provost and R. Melhem | [google] |